The book Understanding and Using C Pointers, by Richard Reese, says:
The null concept is an abstraction supported by the null pointer
constant. This constant may or may not be a constant zero. A C
programmer need not be concerned with their actual internal
representation.
My question is, since "this constant may or may not be a constant zero," is it safe for me to do things like the below in my code:
int *ptr = NULL;
// Some code which probably sets ptr to a valid memory address
if(!ptr)
{
    ERROR();
}
If NULL is not 0, there is a chance that the if clause will evaluate to true.
Is it safe to assume that the NULL constant is zero?
NULL will compare equal to 0.
NULL is very commonly a zero bit pattern. It is possible for NULL to be a non-zero bit pattern, but this is not seen these days.
OP is mixing at least 4 things: NULL, null pointer constant, null pointer, comparing a null pointer to 0. C does not define a NULL constant.
NULL
NULL is a macro "which expands to an implementation-defined null
pointer constant" C17dr § 7.19 3
null pointer constant
An integer constant expression with the value 0, or such an expression
cast to type void *, is called a null pointer constant. C17dr §
6.3.2.3 3
Thus the type of a null pointer constant may be int, unsigned, long, ... or void * .
When it is an integer constant expression [1], the null pointer constant's value is 0. As a pointer like ((void *)0), its value/encoding is not specified. It ubiquitously has the all-zero bit pattern, but that is not required.
There may be many null pointer constants. They all compare equal to each other.
Note: the size of a null pointer constant, when it is an integer, may differ from the size of an object pointer. This size difference is often avoided by appending an L or LL suffix as needed.
null pointer
If a null pointer constant is converted to a pointer type, the
resulting pointer, called a null pointer, is guaranteed to compare
unequal to a pointer to any object or function. C17dr § 6.3.2.3 3
Conversion of a null pointer to another pointer type yields a null
pointer of that type. Any two null pointers shall compare equal. C17dr
§ 6.3.2.3 4
The type of null pointer is some pointer, either an object pointer like int *, char * or function pointer like int (*)(int, int) or void *.
The value of a null pointer is not specified. It ubiquitously does have the bit pattern of zeros, but is not specified so.
All null pointers compare equal, regardless of their encoding.
comparing a null pointer to 0
if(!ptr) is the same as if(!(ptr != 0)). When the pointer ptr, which is a null pointer, is compared to 0, the zero is converted to a pointer, a null pointer of the same type: int *. These 2 null pointers, which could have different bit patterns, compare as equal.
So when is it not safe to assume that the NULL constant is zero?
NULL may be a ((void*)0) and its bit pattern may differ from zeros. It does compare equal to 0 as above regardless of its encoding. Recall pointer compares have been discussed, not integer compares. Converting NULL to an integer may not result in an integer value of 0 even if ((void*)0) was all zero bits.
printf("%ju\n", (uintmax_t)(uintptr_t)NULL); // Possibly not 0
Notice this is converting a pointer to an integer, not the case of if(!ptr) where a 0 was converted to a pointer.
The C spec embraces many old ways of doing things and is open to novel new ones. I have never come across an implementation where NULL was not an all-zeros bit pattern. Given that much code exists that assumes NULL is all zero bits, I suspect only old, obscure implementations ever used a non-zero bit-pattern NULL, and that NULL is all but certain to be an all-zero bit pattern.
[1] The null pointer constant is 1) an integer or 2) a void*. "When an integer ..." refers to the first case, not a cast or conversion of the second case as in (int)((void*)0).
if(!ptr) is a safe way to check for a NULL pointer.
The expression !x is exactly equivalent to 0 == x. The constant 0 is a null pointer constant, and any pointer may be compared for equality against a null pointer constant.
This holds true even if the representation of a null pointer is not "all bits 0".
Section 6.5.3.3p5 of the C standard regarding the ! operator states:
The result of the logical negation operator ! is 0 if the
value of its operand compares unequal to 0, 1 if the value of its
operand compares equal to 0. The result has type int. The
expression !E is equivalent to (0==E).
And section 6.3.2.3p3 regarding pointer conversions states:
An integer constant expression with the value 0, or such an
expression cast to type void *, is called a null pointer
constant. If a null pointer constant is converted to a pointer type,
the resulting pointer, called a null pointer, is guaranteed to compare
unequal to a pointer to any object or function.
chux has written a good, detailed answer, but regarding that book specifically, I'd be sceptical about its quality:
This constant may or may not be a constant zero
This is wrong; it must always be a zero or a zero cast to a void*. The definition of a null pointer constant is found in C17 6.3.2.3/3:
An integer constant expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant. If a null pointer constant is converted to a
pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal
to a pointer to any object or function.
This means that all integer constant expressions like 0, 0L, 0u, 0x0, '\0' etc. are null pointer constants. If any of them is cast to a void*, it is also a null pointer constant.
A C programmer need not be concerned with their actual internal representation.
The author is obviously mixing up the two formal terms null pointer constant and null pointer. A programmer does not need to be concerned with the internal representation of a null pointer. They do need to know what makes a valid null pointer constant, though. The safest, most readable way is to use the NULL macro, which is guaranteed to be a null pointer constant.
So regarding your question "is it safe for me to do things like the below in my code" - yes it is perfectly safe to do !ptr to check for a null pointer, even though ptr==NULL is more readable code.
Related
So, I had an argument with my professor earlier, defending that NULL is not a pointer, but he kept on insisting that it is because there is such a thing as a NULL pointer. So here I am, now a little bit confused about whether NULL is really a pointer or not.
I already tried searching the internet but couldn't find any answer, so this is my last resort.
In C, NULL is a macro that expands to a null pointer constant.
7.19p3
The macros are
NULL which expands to an implementation-defined null pointer constant;
...
A null pointer constant is an integer constant expression with the value 0
(e.g., 0, 1-1, 42*0LL, etc.) or such an expression cast to (void*).
6.3.2.3p3
An integer constant expression with the value 0, or such an expression
cast to type void *, is called a null pointer constant. If a null
pointer constant is converted to a pointer type, the resulting
pointer, called a null pointer, is guaranteed to compare unequal to a
pointer to any object or function.
Most common C implementations define NULL to be 0, 0L, or ((void*)0).
So you are correct. NULL need not be a pointer.
(IIRC, C++ doesn't even allow the (void*) cast in NULL, meaning NULL in C++ always has integer type. Because of that and because void* pointers
do not compare with regular pointers so readily in C++, C++>=11 now has a special nullptr keyword.)
NULL itself is not a pointer, it is a macro that can be used to initialize a pointer to the null pointer value of its type. When compared to a pointer, it compares equal if the pointer is a null pointer and unequal if the pointer is a valid pointer to an object of its type.
There is no semantic difference between char *p = 0; and char *p = NULL; but the latter is more explicit and using NULL instead of 0 is more informative in circumstances where the other operand is not obviously a pointer or if comparing to an integer looks like a type mismatch:
FILE *fp = fopen("myfile", "r");
if (fp == NULL) {
    /* report the error */
}
Similarly, there is no semantic difference in C between '\0' and 0; they are both int constants. The first is the null byte, the second the null value. Using 0, '\0' and NULL wisely may seem futile but makes code more readable by other programmers and oneself too.
The confusion may come from misspelling or mishearing the null pointer as the NULL pointer. The C Standard was carefully proofread to only use null pointer and refer to NULL only as the macro NULL.
Note however that one of the accepted definitions of NULL, #define NULL ((void*)0), makes NULL a null pointer to void.
There's nothing about the idea of a 'pointer to address 0' that's a problem.
The rule is that you're disallowed from dereferencing it... it's allowed to exist, and if created will meet any criteria for "pointerhood" I can think of.
Just because it's not meaningfully a pointer to something...
Literally, does (char *) 0 mean a pointer to some location that contains a zero? Does the system create such an address with value 0 for each such declaration?
No, it's a cast of 0 to type char *. That is, a null pointer. A 0 in any pointer context refers to the null pointer constant.
What exactly it points to doesn't matter - dereferencing it would cause undefined behaviour.
For more, check out the C FAQ Part 5: Null Pointers.
It is null pointer value of type char *.
From the C++ Standard
A null pointer constant can be converted to a pointer type; the result
is the null pointer value of that type
And from the C Standard
3 An integer constant expression with the value 0, or such an
expression cast to type void *, is called a null pointer constant.
If a null pointer constant is converted to a pointer type, the
resulting pointer, called a null pointer, is guaranteed to compare
unequal to a pointer to any object or function.
4 Conversion of a null pointer to another pointer type yields a null
pointer of that type. Any two null pointers shall compare equal.
It is not a pointer that points to a location that contains 0. So the system creates nothing. As it is written in the C Standard, the null pointer "is guaranteed to compare unequal to a pointer to any object or function". So it is used to determine whether a pointer points to some object or function.
It's an explicit cast to a null pointer. I believe there is a #define macro for it, allowing you to just write NULL.
This could be thought of as an extension to this question (I'm interested in C only, but adding C++ to complete the extension)
The C11 standard at 6.3.2.3.3 says:
An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant.
My personal take on this is that 0 and (void *)0 represent the null pointer, whose integer value may not actually be 0, but that doesn't cover 0 cast to any other type.
But, the standard then continues:
If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, ...
which covers (int *)0 as null pointer since cast is an explicit conversion (C11, 6.3) which is listed under conversion methods.
However, what still makes me wonder is the following phrase
... or such an expression cast to type void * ...
With the above semantics, this phrase seems completely useless. The question is, is this phrase completely useless? If not, what implications does it have? Consequently, is (int *)0 the null pointer or not?
Another question that can help the discussion is the following. Is (long long)123 considered "123 converted to long long", or "123 with type long long". In other words, is there any conversion in (long long)123? If there is none, then the second quote above doesn't cover (int *)0 as a null pointer.
Short answer:
In both C and C++, (int *)0 is a constant expression whose value is a null pointer. It is not, however, a null pointer constant. The only observable difference between a constant-expression-whose-value-is-a-null-pointer and a null-pointer-constant, that I know of, is that a null-pointer-constant can be assigned to an lvalue of any pointer type, but a constant-expression-whose-value-is-a-null-pointer has a specific pointer type and can only be assigned to an lvalue with a compatible type. In C, but not C++, (void *)0 is also a null pointer constant; this is a special case for void * consistent with the general C-but-not-C++ rule that void * is assignment compatible with any other pointer-to-object type.
For example:
long *a = 0; // ok, 0 is a null pointer constant
long *b = (long *)0; // ok, (long *)0 is a null pointer with appropriate type
long *c = (void *)0; // ok in C, invalid conversion in C++
long *d = (int *)0; // invalid conversion in both C and C++
And here's a case where the difference between the null pointer constant (void *)0 and a constant-expression-whose-value-is-a-null-pointer with type void * is visible, even in C:
typedef void (*fp)(void); // any pointer-to-function type will show this effect
fp a = 0; // ok, null pointer constant
fp b = (void *)0; // ok in C, invalid conversion in C++
fp c = (void *)(void *)0; // invalid conversion in both C and C++
Also, it's moot nowadays, but since you brought it up: No matter what the bit representation of long *'s null pointer is, all of these assertions behave as indicated by the comments:
// 'x' is initialized to a null pointer
long *x = 0;
// 'y' is initialized to all-bits-zero, which may or may not be the
// representation of a null pointer; moreover, it might be a "trap
// representation", UB even to access
long *y;
memset(&y, 0, sizeof y);
assert (x == 0); // must succeed
assert (x == (long *)0); // must succeed
assert (x == (void *)0); // must succeed in C, unspecified behavior in C++
assert (x == (int *)0); // invalid comparison in both C and C++
assert (memcmp(&x, &y, sizeof y) == 0); // unspecified
assert (y == 0); // UNDEFINED BEHAVIOR: y may be a trap representation
assert (y == x); // UNDEFINED BEHAVIOR: y may be a trap representation
"Unspecified" comparisons do not provoke undefined behavior, but the standard doesn't say whether they evaluate true or false, and the implementation is not required to document which of the two it is, or even to pick one and stick to it. It would be perfectly valid for the above memcmp to alternate between returning 0 and 1 if you called it many times.
Long answer with standard quotes:
To understand what a null pointer constant is, you first have to understand what an integer constant expression is, and that's pretty hairy -- a complete understanding requires you to read sections 6.5 and 6.6 of C99 in detail. This is my summary:
A constant expression is any C expression which the compiler can evaluate to a constant without knowing the value of any object (const or otherwise; however, enum values are fair game), and which has no side effects. (This is a drastic simplification of roughly 25 pages of standardese and may not be exact.)
Integer constant expressions are a restricted subset of constant expressions, conveniently defined in a single paragraph, C99 6.6p6 and its footnote:
An integer constant expression (96) shall have integer type and shall only have operands that are integer constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, and floating constants that are the immediate operands of casts. Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the sizeof operator.
(96) An integer constant expression is used to specify the size of a bit-field member of a structure, the value of an enumeration constant, the size of an array, or the value of a case constant. Further constraints that apply to the integer constant expressions used in [#if] are discussed in 6.10.1.
For purpose of this discussion, the important bit is
Cast operators ... shall only convert arithmetic types to integer types
which means that (int *)0 is not an integer constant expression, although it is a constant expression.
The C++98 definition appears to be more or less equivalent, modulo C++ features and deviations from C. For instance, the stronger separation of character and boolean types from integer types in C++ means that the C++ standard speaks of "integral constant expressions" rather than "integer constant expressions", and then sometimes requires not just an integral constant expression, but an integral constant expression of integer type, excluding char, wchar_t, and bool (and maybe also signed char and unsigned char? it's not clear to me from the text).
Now, the C99 definition of null pointer constant is what this question is all about, so I'll repeat it: 6.3.2.3p3 says
An integer constant expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant. If a null pointer constant is converted to a
pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.
Standardese is very, very literal. Those two sentences mean exactly the same thing as:
An integer constant expression with the value 0 is called a null pointer constant.
An integer constant expression with the value 0, cast to type void *, is also a null pointer constant.
When any null pointer constant is converted to a pointer type, the resulting pointer is called a null pointer and is guaranteed to compare unequal ...
(Italics - definition of term. Boldface - my emphasis.) So what that means is, in C, (long *)0 and (long *)(void *)0 are two ways of writing exactly the same thing, namely the null pointer with type long *.
C++ is different. The equivalent text is C++98 4.10 [conv.ptr]:
A null pointer constant is an integral constant expression (5.19) rvalue of integer type that evaluates to zero.
That's all. "Integral constant expression rvalue of integer type" is very nearly the same thing as C99's "integer constant expression", but there are a few things that qualify in C but not C++: for instance, in C the character literal '\x00' is an integer constant expression, and therefore a null pointer constant, but in C++ it is not an integral constant expression of integer type, so it is not a null pointer constant either.
More to the point, though, C++ doesn't have the "or such an expression cast to void *" clause. That means that ((void *)0) is not a null pointer constant in C++. It is still a null pointer, but it is not assignment compatible with any other pointer type. This is consistent with C++'s generally pickier type system.
C++11 (but not, AFAIK, C11) revised the concept of "null pointer", adding a special type for them (nullptr_t) and a new keyword which evaluates to a null pointer constant (nullptr). I do not fully understand the changes and am not going to try to explain them, but I am pretty sure that a bare 0 is still a valid null pointer constant in C++11.
Evaluating the expression (int*)0 yields a null pointer of type int*.
(int*)0 is not a null pointer constant.
A null pointer constant is a particular kind of expression that may appear in C source code. A null pointer is a value that may occur in a running program.
C and C++ (being two distinct languages) have slightly different rules in this area. C++ doesn't have the "or such an expression cast to type void*" wording. But I don't think that affects the answer to your question.
As for your question about (long long)123, I'm not sure how it's related, but the expression 123 is of type int, and the cast specifies a conversion from int to long long.
I think the core confusion is an assumption that the cast in (int*)0 does not specify a conversion, since 0 is already a null pointer constant. But a null pointer constant is not necessarily an expression of pointer type. In particular, the expression 0 is both a null pointer constant and an expression of type int; it is not of any pointer type. The term null pointer constant needs to be thought of as a single concept, not a phrase whose meaning depends on the individual words that make it up.
NULL appears to be zero in my GCC test programs, but wikipedia says that NULL is only required to point to unaddressable memory.
Do any compilers make NULL non-zero? I'm curious whether if (ptr == NULL) is better practice than if (!ptr).
NULL is guaranteed to be zero, perhaps cast to (void *).
C99, §6.3.2.3, ¶3
An integer constant expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant.(55) If a null pointer constant is converted to a
pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal
to a pointer to any object or function.
And note 55 says:
55) The macro NULL is defined in <stddef.h> (and other headers) as a null pointer constant.
Notice that, because of how the rules for null pointers are formulated, the value you use to assign/compare null pointers is guaranteed to be zero, but the bit pattern actually stored inside the pointer can be any other thing (but AFAIK only few very esoteric platforms exploited this fact, and this should not be a problem anyway since to "see" the underlying bit pattern you should go into UB-land anyway).
So, as far as the standard is concerned, the two forms are equivalent (!ptr is equivalent to ptr==0 due to §6.5.3.3 ¶5, and ptr==0 is equivalent to ptr==NULL); if(!ptr) is also quite idiomatic.
That being said, I usually write explicitly if(ptr==NULL) instead of if(!ptr) to make it extra clear that I'm checking a pointer for nullity instead of some boolean value.
Notice that in C++ the void * cast cannot be present due to the stricter implicit casting rules that would make the usage of such NULL cumbersome (you would have to explicitly convert it to the compared pointer's type every time).
From the language standard:
6.3.2.3 Pointers
...
3 An integer constant expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant. 55) If a null pointer constant is converted to a
pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal
to a pointer to any object or function.
...
55) The macro NULL is defined in <stddef.h> (and other headers) as a null pointer constant; see 7.17.
Given that language, the macro NULL should evaluate to a zero-valued expression (either an undecorated literal 0, an expression like (void *) 0, or another macro or expression that ultimately evaluates to 0). The expressions ptr == NULL and !ptr should be equivalent. The second form tends to be more idiomatic C code.
Note that the null pointer value doesn't have to be 0. The underlying implementation may use any value it wants to represent a null pointer. As far as your source code is concerned, however, a zero-valued pointer expression represents a null pointer.
In practice it is the same, but NULL is different from zero: zero means there is a value, and NULL means there isn't any. So, theoretically, they are different; NULL has a different meaning, and in some cases that difference should be of some use.
In practice, no; !ptr is correct.
In C, what is the difference between a NULL pointer and a pointer that points to 0?
The ISO/IEC 9899:TC2 states in 6.3.2.3 Pointers
3 An integer constant expression with the value 0, or such an expression
cast to type void *, is called a null pointer constant. If a null
pointer constant is converted to a pointer type, the resulting
pointer, called a null pointer, is guaranteed to compare unequal to a
pointer to any object or function
The macro NULL expands to an implementation-defined null pointer constant.
Any two null pointers shall compare equal.
Yes there is. The standard dictates that NULL always points to invalid memory. But it does not state that the integer representation of the pointer must be 0. I've never come across an implementation for which NULL was other than 0, but that is not mandated by the standard.
Note that assigning the literal 0 to a pointer does not mean that the pointer assumes the integer representation of 0. It means that the special null pointer value is assigned to the pointer variable.
Evaluating the literal 0 in a pointer context is identical to NULL. Whatever bit pattern the compiler uses to represent a NULL pointer is hidden.
The old comp.lang.c FAQ has a big section on the null pointer and it's worth a read.
comp.lang.c null pointers
The idea is that a NULL pointer should somehow represent a memory area that is invalid.
Since OS code is mapped into the lower memory segments, the value 0 has been used to represent the NULL pointer: that area of memory does not belong to the user's program but is mapped to the OS code.