I have a function that I would like to be able to return special values for failure and uninitialized (it returns a pointer on success).
Currently it returns NULL for failure, and -1 for uninitialized, and this seems to work... but I could be cheating the system. IIRC, addresses are always positive, are they not? (although since the compiler is allowing me to set an address to -1, this seems strange).
[update]
Another idea I had (in the event that -1 was risky) is to malloc a char at global scope, and use that address as a sentinel.
No, addresses aren't always positive - on x86_64, pointers are sign-extended and the address space is clustered symmetrically around 0 (though it is usual for the "negative" addresses to be kernel addresses).
However the point is mostly moot, since C only defines the meaning of < and > pointer comparisons between pointers that point into the same object, or one past the end of an array. Pointers to completely different objects cannot be meaningfully compared other than for exact equality, at least in standard C - if (p < NULL) has no well-defined semantics.
You should create a dummy object with static storage duration and use its address as your uninitialised value:
extern char uninit_sentinel;
#define UNINITIALISED ((void *)&uninit_sentinel)
It's guaranteed to have a single, unique address across your program.
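To illustrate, a minimal sketch of how this might be wired up (get_resource and the handler functions are hypothetical placeholders, not part of the original answer):
char uninit_sentinel;              /* definition, in exactly one source file */

void use_it(void) {
    void *p = get_resource();      /* hypothetical function returning void * */
    if (p == UNINITIALISED)
        handle_uninitialised();    /* hypothetical */
    else if (p == NULL)
        handle_failure();          /* hypothetical */
    else
        use(p);                    /* hypothetical: p is a real pointer */
}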
The valid values for a pointer are entirely implementation-dependent, so, yes, a pointer address could be negative.
More importantly, however, consider (as an example of a possible implementation choice) the case where you are on a 32-bit platform with a 32-bit pointer size. Any value that can be represented by that 32-bit value might be a valid pointer. Other than the null pointer, any pointer value might be a valid pointer to an object.
For your specific use case, you should consider returning a status code and perhaps taking the pointer as a parameter to the function.
It's generally a bad design to try to multiplex special values onto a return value... you're trying to do too much with a single value. It would be cleaner to return your "success pointer" via argument, rather than the return value. That leaves lots of non-conflicting space in the return value for all of the conditions you want to describe:
int SomeFunction(SomeType **p)
{
    *p = NULL;
    if (/* check for uninitialized ... */)
        return UNINITIALIZED;
    if (/* check for failure ... */)
        return FAILURE;
    *p = yourValue;
    return SUCCESS;
}
You should also do typical argument checking (ensure that 'p' isn't NULL).
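A caller of SomeFunction might then look like this sketch (the status constants are assumed to come from the snippet above):
SomeType *value;
switch (SomeFunction(&value)) {
case SUCCESS:
    /* use value */
    break;
case UNINITIALIZED:
    /* handle the uninitialized case */
    break;
case FAILURE:
default:
    /* handle failure */
    break;
}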
The C language does not define the notion of "negativity" for pointers. The property of "being negative" is a chiefly arithmetical one, not in any way applicable to values of pointer type.
If you have a pointer-returning function, then you cannot meaningfully return the value of -1 from that function. In C, integral values (other than zero) are not implicitly convertible to pointer types. An attempt to return -1 from a pointer-returning function is an immediate constraint violation that will result in a diagnostic message. In short, it is an error. If your compiler allows it, it simply means that it doesn't enforce that constraint too strictly (most of the time they do it for compatibility with pre-standard code).
If you force the value of -1 to pointer type by an explicit cast, the result of the cast will be implementation-defined. The language itself makes no guarantees about it. It might easily prove to be the same as some other, valid pointer value.
If you want to create a reserved pointer value, there's no need to malloc anything. You can simply declare a global variable of the desired type and use its address as the reserved value. It is guaranteed to be unique.
Pointers can be negative only in the sense that an unsigned integer can be negative. That is, sure, in a two's-complement interpretation, you could interpret the numerical value to be negative because the most-significant bit is on.
What's the difference between failure and uninitialized? If uninitialized is not another kind of failure, then you probably want to redesign the interface to separate these two conditions.
Probably the best way to do this is to return the result through a parameter, so the return value only indicates an error. For example, where you would write:
void* func();

void* result = func();
if (result == 0)
    /* handle error */
else if (result == (void*)-1)
    /* uninitialized */
else
    /* initialized */
Change this to
// sets *a to the returned object
// *a will be null if the object has not been initialized
// returns true on success, false otherwise
int func(void** a);

void* result;
if (!func(&result)) {
    /* handle error */
    return;
}
/* do real stuff now */
if (!result) {
    /* initialize */
}
/* continue using the result now that it's been initialized */
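A matching implementation of func might look like the following sketch (the condition checks and make_object are hypothetical placeholders):
int func(void** a) {
    *a = NULL;
    if (/* failure condition */)
        return 0;            /* error */
    if (/* object not yet initialized */)
        return 1;            /* success, but *a stays NULL */
    *a = make_object();      /* hypothetical constructor */
    return 1;                /* success with a valid object */
}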
@James is correct, of course, but I'd like to add that pointers don't always represent absolute memory addresses, which theoretically would always be positive. Pointers also represent relative addresses to some point in memory, often a stack or frame pointer, and those can be both positive and negative.
So your best bet is to have your function accept a pointer to a pointer as a parameter and fill that pointer with a valid pointer value on success while returning a result code from the actual function.
James' answer is probably correct, but of course describes an implementation choice, not a choice that you can make.
Personally, I think addresses are "intuitively" unsigned. Finding a pointer that compares as less-than a null pointer would seem wrong. But ~0 and -1, for the same integer type, give the same value. If it's intuitively unsigned, ~0 may make a more intuitive special-case value - I use it for error-case unsigned ints quite a lot. It's not really different (zero is an int by default, so ~0 is -1 until you cast it) but it looks different.
Pointers on 32-bit systems can use all 32 bits BTW, though -1 or ~0 is an extremely unlikely pointer to occur for a genuine allocation in practice. There are also platform-specific rules - for example on 32-bit Windows, a process can only have a 2GB address space, and there's a lot of code around that encodes some kind of flag into the top bit of a pointer (e.g. for balancing flags in balanced binary trees).
Actually, (at least on x86), the NULL-pointer exception is generated not only by dereferencing the NULL pointer, but by a larger range of addresses (e.g., the first 64 KB). This helps catch such errors as
int* x = NULL;
x[10] = 1;
So, there are more addresses that are guaranteed to generate the NULL pointer exception when dereferenced.
Now consider this code (made compilable for AndreyT):
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#define ERR_NOT_ENOUGH_MEM ((int)NULL)
#define ERR_NEGATIVE ((int)NULL + 1)
#define ERR_NOT_DIGIT ((int)NULL + 2)
char* fn(int i){
    if (i < 0)
        return (char*)ERR_NEGATIVE;
    if (i >= 10)
        return (char*)ERR_NOT_DIGIT;
    char* rez = (char*)malloc(strlen("Hello World ") + sizeof(char)*2);
    if (rez)
        sprintf(rez, "Hello World %d", i);
    return rez;
}
int main(){
    char* rez = fn(3);
    switch((int)rez){
        case ERR_NOT_ENOUGH_MEM: printf("Not enough memory!\n"); break;
        case ERR_NEGATIVE: printf("The parameter was negative\n"); break;
        case ERR_NOT_DIGIT: printf("The parameter is not a digit\n"); break;
        default: printf("we received %s\n", rez);
    }
    return 0;
}
This could be useful in some cases.
It won't work on some Harvard architectures, but will work on von Neumann ones.
Do not use malloc for this purpose. It might keep unnecessary memory tied up (if a lot of memory is already in use when malloc gets called and the sentinel gets allocated at a high address, for example) and it confuses memory debuggers/leak detectors. Instead simply return a pointer to a local static const char object. This pointer will never compare equal to any pointer the program could obtain in any other way, and it only wastes one byte of bss.
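A minimal sketch of that suggestion (lookup and the surrounding logic are hypothetical):
static const char uninit_marker;            /* occupies one byte of bss */
#define UNINIT ((void *)&uninit_marker)

void *lookup(void) {
    if (/* not initialised yet */)
        return UNINIT;                      /* the reserved value */
    /* ... otherwise return a real pointer, or NULL on failure ... */
}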
You don't need to care about the signedness of a pointer, because it's implementation-defined. The real question here is "how to return special values from a function returning a pointer?", which I've explained in detail in my answer to the question Pointer address span on various platforms.
In summary, the all-ones bit pattern (-1) is (almost) always safe, because it's already at the end of the spectrum, data cannot be stored wrapped around to the first address, and the malloc family never returns -1. In fact this value is even returned by many Linux system calls and Win32 APIs to indicate another state for the pointer. So if you need just failure and uninitialized then it's a good choice.
But you can return far more error states by utilizing the fact that variables must be aligned properly (unless you specified some other options). For example, in a pointer to int32_t the low 2 bits are always zero, which means only ¼ of the possible values are valid addresses, leaving all of the remaining bit patterns for you to use. So a simple solution would be just checking the lowest bit:
int* result = func();
if (!result)
    error_happened();
else if ((uintptr_t)result & 1)
    uninitialized();
In this case you can return both a valid pointer and some additional data at the same time.
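The encoding side of that scheme might look like this sketch (func and untag are hypothetical names; uintptr_t comes from <stdint.h>):
#include <stdint.h>

int* func(void) {
    int* p = /* ... a real, suitably aligned allocation ... */;
    if (/* not initialized yet */)
        return (int*)((uintptr_t)p | 1);   /* tag the low bit */
    return p;                              /* untagged: a plain valid pointer */
}

/* the caller recovers the real pointer by masking the tag off */
int* untag(int* result) {
    return (int*)((uintptr_t)result & ~(uintptr_t)1);
}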
You can also use the high bits for storing data in 64-bit systems. On ARM there's a flag that tells the CPU to ignore the high bits in the addresses. On x86 there isn't a similar thing but you can still use those bits as long as you make it canonical before dereferencing. See Using the extra 16 bits in 64-bit pointers
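As a sketch of that 64-bit variant on x86-64, where only the low 48 bits of an address are significant and the top bits must be a sign extension of bit 47 (all names here are hypothetical; note that the arithmetic right shift of a negative signed value is implementation-defined in standard C, though universal in practice):
#include <stdint.h>

/* pack a 16-bit tag into the top of a 64-bit pointer */
void* pack(void* p, uint16_t tag) {
    return (void*)(((uintptr_t)p & 0x0000FFFFFFFFFFFFu)
                   | ((uintptr_t)tag << 48));
}

uint16_t get_tag(void* p) {
    return (uint16_t)((uintptr_t)p >> 48);
}

/* restore canonical form (sign-extend bit 47) before dereferencing */
void* unpack(void* p) {
    return (void*)(((intptr_t)((uintptr_t)p << 16)) >> 16);
}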
See also
Is ((void *) -1) a valid address?
NULL is the only valid error return in this case; this is true any time an unsigned value such as a pointer is returned. It may be true that in some cases pointers will not be large enough to use the sign bit as a data bit; however, since pointers are controlled by the OS, not the program, I would not rely on this behavior.
Remember that on a 32-bit system a pointer is basically a 32-bit value; whether or not this is a possibly negative or always positive number is just a matter of interpretation (i.e., whether the 32nd bit is interpreted as the sign bit or as a data bit). So if you interpreted 0xFFFFFFFF as a signed number it would be -1; if you interpreted it as an unsigned number it would be 4294967295. Technically, it is unlikely that a pointer would ever be this large, but this case should be considered anyway.
As far as an alternative you could use an additional out parameter (returning NULL for all failures), however this would require clients to create and pass a value even if they don't need to distinguish between specific errors.
Another alternative would be to use the GetLastError/SetLastError mechanism to provide additional error information (This would be specific to Windows, don't know if that is an issue or not), or to throw an exception on error instead.
Positive or negative is not a meaningful facet of a pointer type. It pertains to signed integer types, including signed char, short, int, etc.
People talk about negative pointer mostly in a situation that treats pointer's machine representation as an integer type. e.g. reinterpret_cast<intptr_t>(ptr). In this case, they are actually talking about the cast integer, not the pointer itself.
In some scenarios I think a pointer is inherently unsigned; we talk about addresses in terms of below or above. 0xFFFFFFFF is above 0x0AAA0000, which is intuitive for human beings, although 0xFFFFFFFF is actually "negative" while 0x0AAA0000 is positive.
But in other scenarios, such as pointer subtraction (ptr1 - ptr2), the result is a signed value of type ptrdiff_t. This is inconsistent with integer subtraction: signed_int_a - signed_int_b results in a signed int, and unsigned_int_a - unsigned_int_b produces an unsigned type. Pointer subtraction produces a signed type because the semantics are the distance between two pointers, measured in a number of elements.
In summary I suggest treating the pointer type as a standalone type; every type has its own set of operations. For pointers (excluding function pointers, member function pointers, and void *) the operations are:
+, += (ptr + any_integer_type)
-, -= (ptr - any_integer_type, and ptr1 - ptr2)
++, both prefix and postfix
--, both prefix and postfix
Note there are no / * % operations for pointers. That also supports the view that a pointer should be treated as a standalone type, instead of "a type similar to int" or "a type whose underlying type is int so it should look like int". (A small illustration follows.)
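Here is that set of operations and their result types in plain standard C:
#include <stddef.h>

void pointer_ops(void) {
    int arr[10];
    int *p = arr;
    int *q = arr + 7;       /* ptr + integer */
    p += 2;                 /* ptr += integer */
    ++p;                    /* prefix increment */
    p--;                    /* postfix decrement */
    ptrdiff_t d = q - p;    /* ptr - ptr: a signed distance in elements */
    (void)d;
}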
Related
I am getting the below error on the Linux ARM architecture when trying to compare a void pointer with addr > (void *)0XFFFE00000000 (here addr is of type void *):
error: ordered comparison of pointer with null pointer [-Werror=extra]
This is happening only on the Linux ARM architecture; on other architectures it works fine.
addr > (void *)0XFFFE00000000
How to solve this?
Probably the integer literal is overflowing into 32 bits, which becomes 0 or NULL.
But you shouldn't go around comparing random (void) pointers against some random integer anyway. Cast the pointer to uintptr_t, and make sure the literal is of a suitable type too; then it starts becoming more likely to work. There doesn't seem to be a UINTPTR_C() macro, but perhaps it makes sense to use UINTMAX_C()?
Of course, if your unspecified "ARM" is 32-bit, then the address is way out of bounds and probably larger than the pointers will be ... quite confusing.
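A sketch of that suggestion, assuming a 64-bit target (above_limit is a hypothetical wrapper; UINTMAX_C comes from <stdint.h>):
#include <stdint.h>
#include <stdbool.h>

bool above_limit(const void *addr) {
    return (uintmax_t)(uintptr_t)addr > UINTMAX_C(0xFFFE00000000);
}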
Comparing the ordering of two pointers doesn’t make sense except when both pointers point into the same array (and even then it’s questionable at best; generally you’d use inequality instead of ordering).
Since your actual problem is
my signal's address higher bytes is getting overwritten by 0XFFFE0
The first order of business is to find out why this is happening and whether it can be prevented: if an address gets overwritten, this indicates that there's something very wrong with the code, and that you should fix the root cause rather than the symptoms.
That said, if all that’s required is to zero out the higher, overridden bytes of your pointer, the portable way is to convert the pointer to an integer and manipulate that, rather than manipulating the pointer directly:
const uintptr_t mask_bytes = 0xFFFE;
const int mask_width = 4 * CHAR_BIT; // ?!
const uintptr_t mask = mask_bytes << ((sizeof(uintptr_t) * CHAR_BIT) - mask_width);

uintptr_t uaddr = (uintptr_t) addr;
if ((uaddr & mask) == mask) {
    addr = (void*) (uaddr & ~mask);
}
… substitute void* with your actual pointer type.
What's special about integer pointers (but any pointer type really) is that you can assign NULL to them, a non-integer sort of value; whereas an integer has to, no matter what, store an integer, be it a positive or negative one (correct me if I'm wrong), and no non-integer values.
Could you then make use of this 'special' feature of pointers (that of being a variable that can store integers and a non-integer value, NULL) to store integer values (via their literal actual value: instead of an address, it would store a signed/unsigned integer) and at the same time act as a sort of boolean, where a NULL pointer would signify false and a valid pointer (i.e., one that is holding an integer; not really valid, of course, just one that isn't NULL) would signify true?
Note: The pointer is absolutely never used to access a memory location, just to store an int.
(Ofc this is just for a particular use case, not that you would do this in regular code)
(If you really want to know) I'm trying to make a recursive function, and want the return value of the function to return an integer, but I also want to keep track of a condition, so I also want it to return a boolean; but you obviously can only return a single value... so could passing an integer pointer, a variable that can do both at once, be the solution?
I thought of other ways (structs, arrays...), but I'm curious whether doing it with an integer pointer could be a plausible way.
There’s nothing special about a pointer with regard to NULL. On modern Intel based implementations not running in 8086 real mode, a pointer is just an unsigned integer, and NULL is 0. You can’t store something “extra” in that way.
If you need to return two values from your function, create a struct containing an int and a bool and have your function return that.
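A minimal sketch of that suggestion for the recursive use case (all names here are hypothetical):
#include <stdbool.h>

struct result {
    int value;
    bool flag;
};

struct result recurse(int n) {
    if (n <= 0)
        return (struct result){ .value = 0, .flag = false };
    struct result r = recurse(n - 1);
    r.value += n;      /* the integer result */
    r.flag = true;     /* the condition travels alongside it */
    return r;
}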
Is it possible to store a signed integer in an integer pointer (int *)?
Maybe. It might "work". Even the assignment, without de-referencing the pointer, may cause the program to stop. Even with a successful conversion, information may be lost.
"An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation." (C11 §6.3.2.3 5)
// Sample implementation
int i = -rand();
printf("i: %d\n", i);
int *iptr = (int *) i; // implementation-defined
printf("iptr: %p\n", (void*) iptr);
What's special about integer pointers (?)
When valid, they are correctly aligned and, when dereferenced, point to the specific integer type. They may exist in an address space that is not suitable for some other types. C even allows an int * to be narrower than a void *. (Have not seen a machine take advantage of that in some time, though.)
.. an integer has to, no matter what, store an integer ...
Counter-examples: code can store a _Bool in an integer and recover it unchanged. A void * can be saved in an integer of type (u)intptr_t and recovered with an equivalent value.
An integer of the optional type (u)intptr_t can convert to a void * and maintain pointer equivalence. This is not necessarily true with direct casting of other non-character pointers or of function pointers, nor is it necessarily true with other integer types.
some_type_t *some_valid_object_pointer = ...;
intptr_t i = (intptr_t)(void*) some_valid_object_pointer;
some_type_t *some_valid_object_pointer2 = (some_type_t*)(void*)i;
assert(some_valid_object_pointer == some_valid_object_pointer2);
Could you then make use of this 'special' feature of pointers
Not with certainty. OP's implementation may work on select platforms, but it lacks specified behavior and portability.
Maybe I'm mistaken, but why not just use a bool * with stdbool.h?
The only benefit of using NULL and something else would be that you don't have to malloc/free the bool *, but at the cost of bad semantics and the risk of misuse.
Here is an example, how I understand your proposal:
static void foo(int *_dontuse) {
    if (_dontuse != NULL) {
        /* do stuff */
    } else {
        /* do stuff */
    }
}
And _dontuse is only used internally and you never malloc/free it, but let it point to arbitrary memory locations. So potentially risky.
Better idea:
#include <stdbool.h>

struct Sometype {
    bool done;
    /* other members */
};

static void internalfoo(struct Sometype *data) {
    /* use data, maybe store bool inside struct for ifs */
    if (data->done) return;
    /* do something */
    data->done = true; /* need to set it to true, to terminate */
    internalfoo(data);
}

void foo() {
    struct Sometype data;
    data.done = false;
    internalfoo(&data);
    /* do something with result */
}
Or try to implement this with dynamic programming.
Yes, you can [if you are working on a 32-bit or 64-bit system]. In those cases the size of int is less than or equal to the size of int *, so there is no problem doing that.
NULL is just a pointer to a RAM segment that you are not able to access (this segment begins at address 0 and continues for some size). That's it; it's just a numerical value of a memory segment.
I have a callback function written in C that runs on a server and MUST be crash-proof. That is, if it is expecting an integer and is passed a character string pointer, I must determine that internally to the function, and prevent segmentation faults when trying to do something not allowed on the incorrect parameter type.
The function protoype is:
void callback_function(parameter_type a, const b);
and 'a' is supposed to tell me, via enum, whether 'b' is an integer or a character string pointer.
If the calling function programmer makes a mistake, and tells me in 'a' that 'b' is an integer, and 'b' is really a character string pointer, then how do I determine that without crashing the callback function code. This runs on a server and must keep going if the caller function made a mistake.
The code has to be in C, and be portable so C library extensions would not be possible. The compiler is: gcc v4.8.2
The sizeof an integer on the platform is 4, and so is the length of a character pointer.
An integer could have the same value, numerically, as a character pointer, and vice versa.
If I think I get a character pointer and it's not one, then when I try to access its contents, I of course get a segmentation fault.
If I write a signal handler to handle the fault, how do I now "clear" the signal, and resume execution at a sane place?
Did I mention that 'b' is a union defined as:
union param_2 {
    char * c;
    int i;
} param_to_be_passed;
I think that's about it.
Thank You for your answers.
That is not possible.
There's no way to "look" at a pointer and determine if it's valid to dereference, except for NULL-checking it, of course.
Other than that, there's no magic way to know if a pointer points at character data, an integer, a function, or anything else.
You are looking for a hack.
Whatever proposal comes, do not use such things in production.
If late binding is needed, take a different, fail-safe approach.
If you're writing code for an embedded device, you would expect that all variables would reside in RAM. For example, you might have 128 kB of RAM from addresses 0x20000000 to 0x20020000. If you were passed a pointer to a memory address outside this range in 'b.c', that would be another way to determine something was wrong, in addition to checking for a NULL address.
if ((a == STRING) && ((b.c == NULL) || ((uintptr_t)b.c < 0x20000000) || ((uintptr_t)b.c > 0x20020000)))
    return ERROR;
If you're working in a multithreaded environment, you may be able to take this a step further and require all addresses passed to callback_function come from a certain thread's memory space (stack).
If the caller says in 'a' that the result is an int, there is no great risk of crash, because:
in your case both types have the same length (be aware that this is NOT GUARANTEED TO BE PORTABLE!)
the C standard says (ISO, §6.3.2.3): "Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined."
But fortunately, most 32-bit values will be a valid integer.
Keep in mind that in the worst case, the value could be meaningless. So it's up to you to avoid the crash, by systematically verifying consistency of the value (for example, do bounds checks if you use the integer to address some array elements).
If the caller says in "a" that the result is a pointer but provides an int, it's much more difficult to avoid a crash in a portable manner.
The standard ISO says: An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.
In practice most of these errors are trapped by memory access exceptions at a very low system level. The behaviour being implementation defined, there's no portable way of doing it.
NOTE: This doesn't actually attempt to make the function "crash-proof", because I suspect that that's not possible.
If you are allowed to change the API, one option may be to combine the tag and the union into a struct and only use an API for accessing it.
#include <assert.h>

typedef enum Type { STRING, INT } Type;

typedef struct StringOrInt {
    Type type;
    union { int i; char* s; } value;
} StringOrInt;

void soi_set_int(StringOrInt* v, int i) {
    v->type = INT;
    v->value.i = i;
}

void soi_set_string(StringOrInt* v, char* s) {
    v->type = STRING;
    v->value.s = s;
}

Type soi_get_type(StringOrInt const* v) {
    return v->type;
}

int soi_get_int(StringOrInt const* v) {
    assert(v->type == INT);
    return v->value.i;
}

char* soi_get_string(StringOrInt const* v) {
    assert(v->type == STRING);
    return v->value.s;
}
While this doesn't actually make it crash proof, users of the API will find it more convenient to use the API than change the members by hand, reducing the errors significantly.
Run-time type checking in C is effectively impossible.
The burden is on the caller to pass the data correctly; there's no (good, standard, portable) way for you to determine whether b contains data of the correct type (that is, that the caller didn't pass you a pointer value as an integer or vice versa).
The only suggestion I can make is to create two separate callbacks, one of which takes an int and the other a char *, and put the burden on the compiler to do type checking at compile time.
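A sketch of that suggestion (the callback names are hypothetical):
void callback_int(parameter_type a, int i);
void callback_string(parameter_type a, const char *s);

/* passing a string where an int is expected is now a compile-time error */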
This issue bothered me for a while. I never saw a different definition of NULL; it's always
#define NULL ((void *) 0)
Is there any architecture where NULL is defined differently, and if so, why doesn't the compiler declare this for us?
C 2011 Standard, online draft, 6.3.2.3 Pointers:
3 An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant.66) If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.
66) The macro NULL is defined in <stddef.h> (and other headers) as a null pointer constant; see 7.19.
The macro NULL is always defined as a zero-valued constant expression; it can be a naked 0, or 0 cast to void *, or some other integral expression that evaluates to 0. As far as your source code is concerned, NULL will always evaluate to 0.
Once the code has been translated, any occurrence of the null pointer constant (0, NULL, etc.) will be replaced with whatever the underlying architecture uses for a null pointer, which may or may not be 0-valued.
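For example, even on a machine whose null pointer is not all-bits-zero, this source-level test is still correct, because the 0 here is a null pointer constant, not a bit pattern (get_pointer is a hypothetical function):
int *p = get_pointer();
if (p == 0) {
    /* p is a null pointer, regardless of the representation
       the architecture actually uses for it */
}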
WhozCraig wrote these comments to a now-deleted answer, but it could be promoted to a full answer (and that's what I've done here). He notes:
Interesting note: AS/400 is a very unique platform where any non-valid pointer is considered equivalent to NULL. The mechanics which they employ to do this are simply amazing. "Valid" in this sense is any 128-bit pointer (the platform uses a 128bit linear address space for everything) containing a "value" obtained by a known-trusted instruction set. Hard as it is to believe, int *p = (int *)1; if (p) { printf("foo"); } will not print "foo" on that platform. The value assigned to p is not source-trusted, and thus, considered "invalid" and thereby equivalent to NULL.
It's frankly startling how it works. Each 16-byte paragraph in the mapped virtual address space of a process has a corresponding "bit" in a process-wide bitmap. All pointers must reside on one of these paragraph boundaries. If the bit is "lit", the corresponding pointer was stored from a trusted source, otherwise it is invalid and equivalent to NULL. Calls to malloc, pointer math, etc, are all scrutinized in determining whether that bit gets lit or not. And as you can imagine, putting pointers in structures brings a whole new world of hurt on the idea of structure packing.
This is marked community-wiki (it's not my answer — I shouldn't get the credit) but it can be deleted if WhozCraig writes his own answer.
What this shows is that there are real platforms with interesting pointer properties.
There have been platforms where #define NULL ((void *)0) is not the usual definition; on some platforms it can be just 0, on others, 0L or 0ULL or other appropriate values as long as the compiler understands it. C++ does not like ((void *)0) as a definition; systems where the headers interwork with C++ may well not use the void pointer version.
I learned C on a machine where the representation for the char * address for a given memory location was different from the int * address for the same memory location. This was in the days before void *, but it meant that you had to have malloc() properly declared (char *malloc(); — no prototypes either), and you had to explicitly cast the return value to the correct type or you got core dumps. Be grateful for the C standard (though the machine in question, an ICL Perq — badged hardware from Three Rivers — was largely superseded by the time the standard was defined).
In the dark ages before ANSI C, the old K&R C had many different implementations on hardware that would be considered bizarre today. This was before the days of VM, when machines were very "real". Addresses of zero were not only just fine on these machines, an address of zero could be popular... I think it was CDC that sometimes stored the system constant of zero at address zero (and strange things happened if this was set non-zero).
if ( NULL != ptr ) /* like this */
if ( ptr ) /* never like this */
The trick was finding an address you could safely use to indicate "nothing", as storing things at the end of memory was also popular, which ruled out 0xFFFF on some architectures. And these architectures tended to use word addresses rather than byte addresses.
I don't know the answer to this but I'm making a guess. In C you usually do a lot of mallocs, and consequently many tests for returned pointers. Since malloc returns void *, and in particular (void *)0 upon failure, NULL is a natural thing to define in order to test malloc success. Since this is so essential, other library functions use NULL (or (void *)0) too, like fopen. Actually, everything that returns a pointer.
Hence there is no reason to define this at the language level - it's just a special pointer value that can be returned by so many functions.
I saw some usage of (void*) in printf().
If I want to print a variable's address, can I do it like this:
int a = 19;
printf("%d", &a);
I think &a is a's address, which is just an integer, right?
Many articles I read use something like this:
printf("%p", (void*)&a);
What does %p stand for? (A pointer?)
Why use (void*)? Can't I use (int)&a instead?
Pointers are not numbers. They are often internally represented that way, but they are conceptually distinct.
void* is designed to be a generic pointer type. Any pointer value (other than a function pointer) may be converted to void* and back again without loss of information. This typically means that void* is at least as big as other pointer types.
printfs "%p" format requires an argument of type void*. That's why an int* should be cast to void* in that context. (There's no implicit conversion because it's a variadic function; there's no declared parameter, so the compiler doesn't know what to convert it to.)
Sloppy practices like printing pointers with "%d", or passing an int* to printf with a "%p" format, are things that you can probably get away with on most current systems, but they render your code non-portable. (Note that it's common on 64-bit systems for void* and int to be different sizes, so printing pointers with %d" is really non-portable, not just theoretically.)
Incidentally, the output format for "%p" is implementation-defined. Hexadecimal is common, (in upper or lower case, with or without a leading "0x" or "0X"), but it's not the only possibility. All you can count on is that, assuming a reasonable implementation, it will be a reasonable way to represent a pointer value in human-readable form (and that scanf will understand the output of printf).
The article you read is entirely correct. The correct way to print an int* value is
printf("%p", (void*)&a);
Don't take the lazy way out; it's not at all difficult to get it right.
Suggested reading: Section 4 of the comp.lang.c FAQ. (Further suggested reading: all the other sections.)
EDIT:
In response to Alcott's question:
There is still one thing I don't quite understand. int a = 10; int *p = &a;, so p's value is a's address in mem, right? If right, then p's value will range from 0 to 2^32-1 (if cpu is 32-bit), and an integer is 4-byte on 32-bit OS, right? then What's the difference between the p's value and an integer? Can p's value go out of the range?
The difference is that they're of different types.
Assume a system on which int, int*, void*, and float are all 32 bits (this is typical for current 32-bit systems). Does the fact that float is 32 bits imply that its range is 0 to 2^32-1? Or -2^31 to 2^31-1? Certainly not; the range of float (assuming IEEE representation) is approximately -3.40282e+38 to +3.40282e+38, with widely varying resolution across the range, plus exotic values like negative zero, subnormal (denormalized) numbers, infinities, and NaNs (Not-a-Number). int and float are both 32 bits, and you can take the 32 bits of a float object and treat it as an int representation, but the result won't have any straightforward relationship to the value of the float. The second low-order bit of an int, for example, has a specific meaning; it contributes 0 to the value if it's 0, and 2 to the value if it's 1; the corresponding bit of a float has a meaning, but it's quite different (it contributes a value that depends on the value of the exponent).
The situation with pointers is quite similar. A pointer value has a meaning: it's the address of some object (or any of several other things, but we'll set that aside for now). On most current systems, interpreting the bits of a pointer object as if it were an integer gives you something that makes sense on the machine level. But the language itself does not guarantee, or even hint, that that's the case.
Pointers are not numbers.
A concrete example: some years ago, I ran across some code that tried to compute the difference in bytes between two addresses by casting to integers. It was something like this:
unsigned char *p0;
unsigned char *p1;
long difference = (unsigned long)p1 - (unsigned long)p0;
If you assume that pointers are just numbers, representing addresses in a linear monolithic address space, then this code makes sense. But that assumption is not supported by the language. And in fact, there was a system on which that code was intended to run (the Cray T90) on which it simply would not have worked. The T90 had 64-bit pointers pointing to 64-bit words. Byte pointers were synthesized in software by storing an offset in the 3 high-order bits of a pointer object. Subtracting two pointers in the above manner, if they both had 0 offsets, would give you the number of words, not bytes, between the addresses. And if they had non-0 offsets, it would give you meaningless garbage. (Conversion from a pointer to an integer would just copy the bits; it could have done the work to give you a meaningful byte index, but it didn't.)
The solution was simple: drop the casts and use pointer arithmetic:
long difference = p1 - p0;
Other addressing schemes are possible. For example, an address might consist of a descriptor that (perhaps indirectly) references a block of memory, plus an offset within that block.
You can assume that addresses are just numbers, that the address space is linear and monolithic, that all pointers are the same size and have the same representation, that a pointer can be safely converted to int, or to long, and back again without loss of information. And the code you write based on those assumptions will probably work on most current systems. But it's entirely possible that some future systems will again use a different memory model, and your code will break.
If you avoid making any assumptions beyond what the language actually guarantees, your code will be far more future-proof. And even leaving portability issues aside, it will probably be cleaner.
So much insanity present here...
%p is generally the correct format specifier to use if you just want to print out a representation of the pointer. Never, ever use %d.
The length of an int and the length of a pointer (void* or otherwise) have no relationship. Most data models on i386 just happen to have 32-bit ints AND 32-bit pointers -- other platforms, including x86-64, are not the same! (This is also historically known as "all the world's a VAX syndrome".) http://en.wikipedia.org/wiki/64-bit#64-bit_data_models
If for some reason you want to hold a memory address in an integral variable, use the right types! intptr_t and uintptr_t. They're in stdint.h. See http://en.wikipedia.org/wiki/Stdint.h#Integers_wide_enough_to_hold_pointers
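A short sketch of that usage, round-tripping an address through uintptr_t (PRIxPTR is the matching printf macro from <inttypes.h>):
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void) {
    int a = 19;
    uintptr_t addr = (uintptr_t)(void *)&a;   /* hold the address in an integer */
    printf("0x%" PRIxPTR "\n", addr);
    int *p = (int *)(void *)addr;             /* convert back; p compares equal to &a */
    return *p - a;                            /* 0 */
}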
In C void * is an un-typed pointer. void does not mean void... it means anything. Thus casting to void * would be the same as casting to "pointer" in another language.
Using (int *)&a should work too... but the stylistic point of saying (void *) is to say: I don't care about the type, just that it is a pointer.
Note: It is possible for an implementation of C to cause this construct to fail and still meet the requirements of the standards. I don't know of any such implementations, but it is possible.
Although the vast majority of C implementations store pointers to all kinds of objects using the same representation, the C Standard does not require that all implementations do so, nor does it even provide any means by which a program that would exploit commonality of representations could test whether an implementation follows the common practice and refuse to run if it doesn't.
If on some particular platform, an int* held a word address, while both char* and void* combine a word address with a word that identifies a byte within a word, passing an int* to a function that is expecting to retrieve a variadic argument of type char* or void* would result in that function trying to fetch more data from the stack (a word address plus the supplemental word) than had been pushed (just the word address). This could cause the system to malfunction in unpredictable ways.
Many compilers for commonplace platforms that use the same representation for all pointers will process an action which passes a non-void pointer precisely the same way as they would process an action which casts the pointer to void* before passing it. They thus have no reason to care about whether the pointer type that is passed as a variadic argument precisely matches the pointer type expected by the recipient. Although the Standard could have specified that such implementations, which have no reason to care about pointer types, should behave as though the pointers were cast to void*, the authors of the C89 Standard avoided describing anything which wouldn't be common to all conforming compilers. The Standard's terminology for a construct that 99% of implementations would process identically but 1% might process unpredictably is "Undefined Behavior". Implementations may, and often should, extend the semantics of the language by specifying how they will treat such constructs, but that's a Quality of Implementation issue outside the Standard's jurisdiction.