strcmp behaviour in 32-bit and 64-bit systems

The following piece of code behaves differently in 32-bit and 64-bit operating systems.
char *cat = "v,a";
if (strcmp(cat, ",") == 1)
...
The above condition is true in 32-bit but false in 64-bit. I wonder why this is different?
Both 32-bit and 64-bit OS are Linux (Fedora).

The strcmp() function is only defined to return a negative value if argument 1 precedes argument 2, zero if they're identical, or a positive value if argument 1 follows argument 2.
There is no guarantee of any sort that the value returned will be +1 or -1 at any time. Any equality test based on that assumption is faulty. It is conceivable that the 32-bit and 64-bit versions of strcmp() return different numbers for a given string comparison, but any test that looks for +1 from strcmp() is inherently flawed.
Your comparison code should be one of:
if (strcmp(cat, ",") > 0) // cat > ","
if (strcmp(cat, ",") == 0) // cat == ","
if (strcmp(cat, ",") >= 0) // cat >= ","
if (strcmp(cat, ",") <= 0) // cat <= ","
if (strcmp(cat, ",") < 0) // cat < ","
if (strcmp(cat, ",") != 0) // cat != ","
Note the common theme — all the tests compare with 0. You'll also see people write:
if (strcmp(cat, ",")) // != 0
if (!strcmp(cat, ",")) // == 0
Personally, I prefer the explicit comparisons with zero; I mentally translate the shorthands into the appropriate longhand (and resent being made to do so).
Note that the specification of strcmp() says:
ISO/IEC 9899:2011 §7.24.4.2 The strcmp function
¶3 The strcmp function returns an integer greater than, equal to, or less than zero,
accordingly as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2.
It says nothing about +1 or -1; you cannot rely on the magnitude of the result, only on its sign (or on it being zero when the strings are equal).
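As a concrete illustration, here is a minimal sketch (reusing the cat variable from the question) that relies only on the sign of the result; the exact integer strcmp() returns will differ between C libraries:
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *cat = "v,a";
    int r = strcmp(cat, ",");   /* only the sign of r is meaningful */

    if (r > 0)
        printf("\"%s\" sorts after \",\" (strcmp returned %d)\n", cat, r);
    else if (r < 0)
        printf("\"%s\" sorts before \",\"\n", cat);
    else
        printf("\"%s\" equals \",\"\n", cat);
    return 0;
}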

Standard functions don't exhibit different behaviour based on the "bittedness" of your OS unless you're doing something silly like, for example, not including the relevant header file. They are required to exhibit exactly the behaviour specified in the standard, unless you violate the rules; otherwise your compiler, while close, is not a C compiler.
However, as per the standard, the return value from strcmp() is either zero, positive or negative, it's not guaranteed to be +/-1 when non-zero.
Your expression would be better written as:
strcmp (cat, ",") > 0
The faultiness of using strcmp (cat, ",") == 1 has nothing to do with whether your OS is 32 or 64 bits, and everything to do with the fact that you've misunderstood the return value. From the ISO C11 standard:
The strcmp function returns an integer greater than, equal to, or less than zero, accordingly as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2.

The semantics guaranteed by strcmp() are well explained above in Jonathan's answer.
Coming back to your original question i.e.
Q. Why does strcmp() behaviour differ between 32-bit and 64-bit systems?
Answer: strcmp() is implemented in glibc, wherein there exist different implementations for various architectures, all highly optimised for the corresponding architecture.
strcmp() on x86
strcmp() on x86-64
As the spec simply defines that the return value is one of 3 possibilities (-ve, 0, +ve), the various implementations are free to return any value as long as the sign indicates the result appropriately.
On certain architectures (in this case x86), it is faster to simply compare each byte without storing the result. Hence it's quicker to simply return -/+1 on a mismatch.
(Note that one could use subb instead of cmpb on x86 to obtain the difference in magnitude of the non-matching bytes. But this would require 1 additional clock cycle per byte, which would mean an additional 3% increase in the total time taken, as each complete iteration runs in less than 30 clock cycles.)
On other architectures (in this case x86-64), the difference between the byte values of the corresponding characters is already available as a by-product of the comparison. Hence it is faster to simply return it rather than test them again and return -/+1.
Both are perfectly valid output as the strcmp() function is ONLY guaranteed to return the result using the proper sign and the magnitude is architecture/implementation specific.
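To make this concrete, here are two simplified, purely illustrative C implementations, both conforming: the first returns only -1/0/+1, the second returns the raw byte difference. (glibc's real versions are hand-tuned assembly; these sketches only mirror the two return conventions.)
/* Sketch A: return only -1, 0 or +1, branching on the mismatch. */
int strcmp_sign_only(const char *s1, const char *s2)
{
    while (*s1 != '\0' && *s1 == *s2) {
        s1++;
        s2++;
    }
    unsigned char c1 = (unsigned char)*s1;
    unsigned char c2 = (unsigned char)*s2;
    return (c1 > c2) - (c1 < c2);   /* -1, 0 or +1 */
}

/* Sketch B: return the difference of the first mismatching bytes. */
int strcmp_byte_diff(const char *s1, const char *s2)
{
    while (*s1 != '\0' && *s1 == *s2) {
        s1++;
        s2++;
    }
    return (unsigned char)*s1 - (unsigned char)*s2;
}
Both satisfy the specification; only the magnitudes of their non-zero results differ.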

Related

What exactly is strcmp(String comparison) doing?

My code for testing strcmp is as follows:
char s1[10] = "racecar";
char *s2 = "raceCar"; //yes, a capital 'C'
int diff;
diff = strcmp(s1,s2);
printf(" %d\n", diff);
So I am confused about why the output is 32. What exactly is it comparing to get that result? I appreciate your time and help.
Whatever it wants. In this case, it looks like the value you're getting is 'c' - 'C' (the difference between the two characters at the first point where the strings differ), which is equal to 32 on many systems, but you shouldn't by any means count on that. The only thing that you can count on is that the return will be 0 if the two strings are equal, negative if s1 comes before s2, and positive if s1 comes after s2.
The man page states that the output will be greater than 0 or less than 0 if the strings are not the same. It doesn't say anything else regarding the exact value (if not 0).
That being said, the ASCII codes for c and C differ by 32. That's probably where the result is coming from. You can't depend on this behavior being identical in any two given implementations however.
It is not specified. According to the standard:
7.24.4.2 The strcmp function
#include <string.h>
int strcmp(const char *s1, const char *s2);
Description
The strcmp function compares the string pointed to by s1 to the string pointed to by
s2.
Returns
The strcmp function returns an integer greater than, equal to, or less than zero,
accordingly as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2.
According to the C standard (N1570 7.24.4.2):
The strcmp function returns an integer greater than, equal to, or less than zero, accordingly as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2.
It says nothing about which positive or negative value it will return if the strings are unequal, and portable code should only check whether the result is less than, equal to, or greater than zero.
Having said that, a straightforward implementation of strcmp would likely return the numeric difference in the values of the first characters that don't match. In your case, the first non-matching characters are 'c' and 'C', which happen to differ by 32 in ASCII.
Don't count on this.
"strcmp" compares strings and when it reaches a different character, it will return the difference between them.
In your case, it reaches 'c' in your first string, and 'C' in your second string. 'c' in hex is 0x63 while 'C' is 0x43. Subtract and you get 0x20, which is 32 in decimal.
We use strcmp to check whether strings are equal: the function returns 0 when they are.
strcmp compares the strings character by character until it reaches characters that don't match or the terminating null-character.
So the strcmp function sees that 'c' (which is 99 in ASCII) is greater than 'C' (which is 67 in ASCII), and therefore returns a positive integer. Exactly which positive integer it returns depends on your system and C library implementation.
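If you want to see where the 32 comes from on an ASCII system, you can print the byte difference yourself; this only shows what one common implementation happens to return, and portable code must not rely on it:
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* First mismatching characters in "racecar" vs "raceCar". */
    printf("'c' - 'C' = %d\n", 'c' - 'C');                    /* 32 in ASCII */
    printf("strcmp    = %d\n", strcmp("racecar", "raceCar")); /* often 32, but any positive value is allowed */
    return 0;
}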

Is the result of strcmp the same on all machines and compilers?

Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
    int n = strcmp("hello", "help");
    printf("%d\n", n);
    return 0;
}
Result:
-1
Does the value for this program have to be the same on all machines or different compilers?
In other words can this return value take on different values for the same program when run on different compilers or different machines?
It does not have to return -1, but it does have to return a value less than zero (if we assume an ASCII character set); the C99 draft standard, in section 7.21.4.2 The strcmp function, says:
The strcmp function returns an integer greater than, equal to, or less than zero,
accordingly as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2.
and I can get clang to return either -1 (live example with -O3) or -4 (live example with -O0) depending on the optimization level.
With -O3, or even -O1, it looks like clang is not emitting a call to strcmp at all; it will just do a:
movl $-1, %esi
clang is probably using builtin functions to optimize here similar to gcc's builtin and in fact I can only get gcc to emit a call to strcmp in this case using -fno-builtin.
It is important to note that the standard does not guarantee the order of alphabetic characters, although it does say that the decimal digits must be sequential; section 5.2.1 Character sets, paragraph 3, says:
[...]In both the source and execution basic character sets, the
value of each character after 0 in the above list of decimal digits shall be one greater than the value of the previous.[...]
So differing character sets can also lead to different results on different platforms, as we can easily see by comparing ASCII and EBCDIC: in ASCII the capital letters come before the lower case, but it is the opposite in EBCDIC.
The standard library only guarantees a result of 0 (the strings compare equal), less than 0 (could be -1, -10, ...) or greater than 0 (could be 1, 10, 100, ...).
So yes you may get different values.
Only the sign of the returned value is defined by the C standard, so different compilers and different machines may return different values for the same strings.
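If you ever need to feed the result to code that (wrongly) expects exactly -1, 0 or +1, a tiny wrapper that keeps only the sign is safer than relying on any particular library. A minimal sketch (the name strcmp_sign is just illustrative):
#include <string.h>

/* Normalise strcmp's result to exactly -1, 0 or +1. */
int strcmp_sign(const char *a, const char *b)
{
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}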

When will strcmp not return -1, 0 or 1?

From the man page:
The strcmp() and strncmp() functions return an integer less than, equal to, or greater than zero if s1 (or the first n bytes thereof) is found, respectively, to be less than, to match, or be greater than s2.
Example code in C (prints -15 on my machine, swapping test1 and test2 inverts the value):
#include <stdio.h>
#include <string.h>
int main() {
    char* test1 = "hello";
    char* test2 = "world";
    printf("%d\n", strcmp(test1, test2));
}
I found this code (taken from this question) that relies on the values of strcmp being something other than -1, 0 and 1 (it uses the return value in qsort). To me, this is terrible style and depends on undocumented features.
I guess I have two, related questions:
Is there something in the C standard that defines what the return values are besides less than, greater than, or equal to zero? If not, what does the standard implementation do?
Is the return value consistent across the Linux, Windows and the BSDs?
Edit:
After leaving my computer for 5 minutes, I realized that there is in fact no error with the code in question. I struck out the parts that I figured out before reading the comments/answers, but I left them there to keep the comments relevant. I think this is still an interesting question and may cause hiccups for programmers used to other languages that always return -1, 0 or 1 (e.g. Python seems to do this, but it's not documented that way).
FWIW, I think that relying on something other than the documented behavior is bad style.
Is there something in the C standard that defines what the return values are besides less than, greater than, or equal to zero?
No. The tightest constraint is that it should be zero, less than zero or more than zero, as specified in the documentation of this particular function.
If not, what does the standard implementation do?
There's no such thing as "the standard implementation". Even if there was, it would probably just
return zero, less than zero or more than zero;
:-)
Is the return value consistent across the Linux, Windows and the BSDs?
I can confirm that it's consistent across Linux and OS X as of 10.7.4 (specifically, it's -1, 0 or +1). I have no idea about Windows, but I bet Microsoft guys use -2 and +3 just to break code :P
Also, let me also point out that you have completely misunderstood what the code does.
I found this code (taken from this question) that relies on the values of strcmp being something other than -1, 0 and 1 (it uses the return value in qsort). To me, this is terrible style and depends on undocumented features.
No, it actually doesn't. The C standard library is designed with consistency and ease of use in mind. That is, what qsort() requires is that its comparator function returns a negative or a positive number or zero - exactly what strcmp() is guaranteed to do. So this is not "terrible style", it's perfectly standards-conformant code which does not depend upon undocumented features.
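For example, a typical qsort() of an array of strings simply forwards the element pointers to strcmp(); this is essentially the pattern shown in the qsort() man page (the names below are illustrative):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* qsort passes pointers to the array elements, i.e. char ** here. */
int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

int main(void)
{
    const char *words[] = { "pear", "apple", "orange" };
    size_t n = sizeof words / sizeof words[0];
    size_t i;

    qsort(words, n, sizeof words[0], cmp_str);
    for (i = 0; i < n; i++)
        printf("%s\n", words[i]);   /* apple, orange, pear */
    return 0;
}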
In the C99 standard, §7.21.4.2 The strcmp function:
The strcmp function returns an integer greater than, equal to, or less than zero,
accordingly as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2.
Emphasis added.
It means the standard doesn't guarantee -1, 0 or 1; the value may vary between implementations and operating systems.
The value you are getting is the difference between 'h' and 'w', which is -15.
In your case the strings are hello and world, so 'h' - 'w' = -15 < 0, and that's why strcmp returns -15.
• Is there something in the C standard that defines what the return values are besides less than, greater than, or equal to zero? If not, what does the standard implementation do?
No, as you mentioned yourself the man page says less than, equal to, or greater than zero and that's what the standard says as well.
• Is the return value consistent across the Linux, Windows and the BSDs?
No.
On Linux (OpenSuSE 12.1, kernel 3.1) with gcc, I get -15/15 depending on whether test1 or test2 comes first. On Windows 7 (VS 2010) I get -1/1.
Based on the loose definition of strcmp(), both are fine.
...that relies on the values of strcmp being something other than -1, 0 and 1 (it uses the return value in qsort).
An interesting side note for you... if you take a look at the qsort() man page, the example there is pretty much the same as the Bell code you posted using strcmp(). The reason being the comparator function that qsort() requires is actually a great fit for the return from strcmp():
The comparison function must return an integer less than, equal to, or
greater than zero if the first argument is considered to be
respectively less than, equal to, or greater than the second.
In reality, the return value of strcmp is likely to be the difference between the values of the bytes at the first position that differed, simply because returning this difference is a lot more efficient than doing an additional conditional branch to convert it to -1 or 1. Unfortunately, some broken software has been known to assume the result fits in 8 bits, leading to serious vulnerabilities. In short, you should never use anything but the sign of the result.
For details on the issues, read this article:
https://communities.coverity.com/blogs/security/2012/07/19/more-defects-like-the-mysql-memcmp-vulnerability
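A minimal sketch of the truncation pitfall described in that article; the exact values depend on the platform and C library, so treat it as illustrative only:
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* "\xff" > "\x01" when compared as unsigned chars, so the result is
       positive.  On implementations that return the raw byte difference,
       the result may be 254, which does not fit in a signed char. */
    int full = strcmp("\xff", "\x01");
    signed char truncated = (signed char)full;  /* may flip the sign, e.g. 254 -> -2 */

    printf("full = %d, truncated = %d\n", full, truncated);
    return 0;
}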
From this page:
The strcmp() function compares the string pointed to by s1 to the string pointed to by s2.
The sign of a non-zero return value is determined by the sign of the difference between the values of the first pair of bytes (both interpreted as type unsigned char) that differ in the strings being compared.
Here is an implementation of strcmp in FreeBSD.
#include <string.h>
/*
* Compare strings.
*/
int
strcmp(s1, s2)
    register const char *s1, *s2;
{
    while (*s1 == *s2++)
        if (*s1++ == 0)
            return (0);
    return (*(const unsigned char *)s1 - *(const unsigned char *)(s2 - 1));
}
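For readers unused to the old K&R declaration syntax above, an equivalent sketch in modern prototype style might look like this (renamed my_strcmp so it does not collide with the reserved library name):
int my_strcmp(const char *s1, const char *s2)
{
    /* Walk both strings while the bytes match and s1 has not ended. */
    while (*s1 == *s2++)
        if (*s1++ == '\0')
            return 0;
    /* s2 has been advanced one past the mismatch, hence s2 - 1. */
    return *(const unsigned char *)s1 - *(const unsigned char *)(s2 - 1);
}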
From the manual page:
RETURN VALUE
The strcmp() and strncmp() functions return an integer less than, equal to, or greater than zero if s1 (or the first n bytes thereof) is found, respectively, to be less than, to match, or be greater than s2.
It only specifies that the result is greater than or less than 0; it doesn't say anything about specific values, so those are implementation-specific, I suppose.
CONFORMING TO
SVr4, 4.3BSD, C89, C99.
This says in which standards it is included. The function must exist and behave as specified, but the specification doesn't say anything about the actual returned values, so you can't rely on them.
There's nothing in the C standard that talks about the value returned by strcmp() (that is, other than the sign of that value):
7.21.4.2 The strcmp function
Synopsis
#include <string.h>
int strcmp(const char *s1, const char *s2);
Description
The strcmp function compares the string pointed to by s1
to the string pointed to by s2.
Returns
The strcmp function returns an integer greater than, equal
to, or less than zero, accordingly as the string pointed to by s1 is
greater than, equal to, or less than the string pointed to by s2.
It is therefore pretty clear that using anything other than the sign of the returned value is a poor practice.

strncmp C Exercise

I'm trying to do exercise 5-4 in the K&R C book. I have written the functions for strncpy and strncat, but I'm having some trouble understanding exactly what to return for the strncmp part of the exercise.
The definition of strncmp (from Appendix B in K&R book) is:
compare at most n characters of string s to string t; return <0 if s<t, 0 if s==t, or >0 if s>t
Let's say I have 3 strings:
char s[128] = "abc";
char t[128] = "abcdefghijk";
char u[128] = "hello";
And I want to compare them using the strncmp function I have to write. I know that
strncmp(s, t, 3)
will return 0, because abc == abc. Where I'm confused is the other comparisons. For example
strncmp(s, t, 5) and
strncmp(s, u, 4)
The first matches up to the 3rd position and then after that they no longer match, and the second example doesn't match at all.
I really just want know what those 2 other comparisons return and why so that I can write my version of strncmp and finish the exercise.
Both return a negative number (it just compares using character order). I just did a quick test and on my machine it's returning the difference of the last-compared characters. So:
strncmp(s, t, 5) = -100 // '\0' - 'd'
strncmp(s, u, 4) = -7 // 'a' - 'h'
Is that what you're looking for?
The characters in the first non-matching positions are cast to unsigned char and then compared numerically - if that character in s1 is less than the corresponding character in s2, then a negative number is returned; if it's greater, a positive number is returned.
The contract for strncmp is to return an integral value whose sign indicates the result of the comparison:
a negative value indicates that the 1st operand compares as being "less than" the 2nd operand,
a positive, non-zero value indicates that the 1st operand compares as being "greater than" than the 2nd operand, and
0 indicates that the two operands compare as being "equal to" each other.
The reason it's defined that way, rather than, say, "return -1 for less than, 0 for equal to, and +1 for greater than", is to not constrain the implementation.
The value returned for a particular C runtime library is dependent upon how the function is implemented. The Posix specification (IEEE 1003.1) for strncmp() (which tracks the C Standard) says:
The strncmp() function shall compare not more than n bytes (bytes that follow a null
byte are not compared) from the array pointed to by s1 to the array pointed to by s2.
The sign of a non-zero return value is determined by the sign of the difference
between the values of the first pair of bytes (both interpreted as type unsigned
char) that differ in the strings being compared.
That should be about all you need to know to implement it. You should note, though that:
strncmp() is not "safe", in the sense that it is subject to buffer overflows. A proper implementation will merrily compare characters until it encounters an ASCII NUL, hits the maximum length, or tries to access protected memory.
The specification says that the sign of the return value is based on the delta between the 1st pair of characters that differ; no particular return value is mandated.
Good luck.
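Putting that contract together, a straightforward sketch of the exercise might look like the following. It follows the byte-difference convention, but returning only -1/0/+1 would be equally valid; the name my_strncmp is just to avoid clashing with the library function.
#include <stddef.h>   /* size_t */

/* Compare at most n characters of s and t.
   Returns <0, 0 or >0 as s is less than, equal to, or greater than t. */
int my_strncmp(const char *s, const char *t, size_t n)
{
    while (n-- > 0) {
        if (*s != *t)
            return (unsigned char)*s - (unsigned char)*t;
        if (*s == '\0')      /* both strings ended together: equal */
            return 0;
        s++;
        t++;
    }
    return 0;                /* first n characters all matched */
}
With the strings from the question, my_strncmp(s, t, 5) gives '\0' - 'd' (negative) and my_strncmp(s, u, 4) gives 'a' - 'h' (negative), matching the behaviour described above.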
It is lexicographic order; strings are compared in alphabetical order from left to right.
So abc < abcdefghijk < hello
strncmp(s, t, 5) < 0
strncmp(s, u, 4) < 0

What is the difference between NULL, '\0' and 0?

In C, there appear to be differences between various values of zero -- NULL, NUL and 0.
I know that the ASCII character '0' evaluates to 48 or 0x30.
The NULL pointer is usually defined as:
#define NULL 0
Or
#define NULL (void *)0
In addition, there is the NUL character '\0' which seems to evaluate to 0 as well.
Are there times when these three values can not be equal?
Is this also true on 64 bit systems?
Note: This answer applies to the C language, not C++.
Null Pointers
The integer constant literal 0 has different meanings depending upon the context in which it's used. In all cases, it is still an integer constant with the value 0, it is just described in different ways.
If a pointer is being compared to the constant literal 0, then this is a check to see if the pointer is a null pointer. This 0 is then referred to as a null pointer constant. The C standard defines that 0 cast to the type void * is both a null pointer and a null pointer constant.
Additionally, to help readability, the macro NULL is provided in the header file stddef.h. Depending upon your compiler it might be possible to #undef NULL and redefine it to something wacky.
Therefore, here are some valid ways to check for a null pointer:
if (pointer == NULL)
NULL is defined to compare equal to a null pointer. It is implementation defined what the actual definition of NULL is, as long as it is a valid null pointer constant.
if (pointer == 0)
0 is another representation of the null pointer constant.
if (!pointer)
This if statement implicitly checks "is not 0", so we reverse that to mean "is 0".
The following are INVALID ways to check for a null pointer:
int mynull = 0;
<some code>
if (pointer == mynull)
To the compiler this is not a check for a null pointer, but an equality check on two variables. This might work if mynull never changes in the code and the compiler optimizations constant fold the 0 into the if statement, but this is not guaranteed and the compiler has to produce at least one diagnostic message (warning or error) according to the C Standard.
Note that the value of a null pointer in the C language does not matter on the underlying architecture. If the underlying architecture has a null pointer value defined as address 0xDEADBEEF, then it is up to the compiler to sort this mess out.
As such, even on this funny architecture, the following ways are still valid ways to check for a null pointer:
if (!pointer)
if (pointer == NULL)
if (pointer == 0)
The following are INVALID ways to check for a null pointer:
#define MYNULL (void *) 0xDEADBEEF
if (pointer == MYNULL)
if (pointer == 0xDEADBEEF)
as these are seen by a compiler as normal comparisons.
Null Characters
'\0' is defined to be a null character - that is a character with all bits set to zero. '\0' is (like all character literals) an integer constant, in this case with the value zero. So '\0' is completely equivalent to an unadorned 0 integer constant - the only difference is in the intent that it conveys to a human reader ("I'm using this as a null character.").
'\0' has nothing to do with pointers. However, you may see something similar to this code:
if (!*char_pointer)
checks if the char pointer is pointing at a null character.
if (*char_pointer)
checks if the char pointer is pointing at a non-null character.
Don't get these confused with null pointers. Just because the bit representation may be the same, allowing for some convenient crossover cases, they are not really the same thing.
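A short sketch showing the character-pointer checks above in action (the string and variable names are purely illustrative):
#include <stdio.h>

int main(void)
{
    const char *s = "hi";
    const char *p = s;

    while (*p)                  /* advance until p points at the null character */
        p++;

    if (p == NULL)
        printf("p is a null pointer\n");             /* not printed */
    if (*p == '\0')
        printf("p points at the null character\n");  /* printed */
    return 0;
}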
References
See Question 5.3 of the comp.lang.c FAQ for more.
See this pdf for the C standard. Check out section 6.3.2.3 Pointers, paragraph 3.
It appears that a number of people misunderstand what the differences between NULL, '\0' and 0 are. So, to explain, and in attempt to avoid repeating things said earlier:
A constant expression of type int with the value 0, or an expression of this type, cast to type void * is a null pointer constant, which if converted to a pointer becomes a null pointer. It is guaranteed by the standard to compare unequal to any pointer to any object or function.
NULL is a macro, defined in <stddef.h> as a null pointer constant.
\0 is a construction used to represent the null character, used to terminate a string.
A null character is a byte which has all its bits set to 0.
All three define the meaning of zero in different context.
pointer context - NULL is used and means the value of the pointer is 0, independent of whether it is 32-bit or 64-bit (in one case 4 bytes, in the other 8 bytes of zeroes).
string context - the character representing the digit zero has a hex value of 0x30, whereas the NUL character has a hex value of 0x00 (used for terminating strings).
These three are always different when you look at the memory:
NULL - 0x00000000 or 0x00000000'00000000 (32 vs 64 bit)
NUL - 0x00 or 0x0000 (ASCII vs 2-byte Unicode)
'0' - 0x30
I hope this clarifies it.
The C FAQ list question "If NULL and 0 are equivalent as null pointer constants, which should I use?" addresses this issue as well:
C programmers must understand that NULL and 0 are interchangeable in pointer contexts, and that an uncast 0 is perfectly acceptable. Any usage of NULL (as opposed to 0) should be considered a gentle reminder that a pointer is involved; programmers should not depend on it (either for their own understanding or the compiler's) for distinguishing pointer 0's from integer 0's.
It is only in pointer contexts that NULL and 0 are equivalent. NULL should not be used when another kind of 0 is required, even though it might work, because doing so sends the wrong stylistic message. (Furthermore, ANSI allows the definition of NULL to be ((void *)0), which will not work at all in non-pointer contexts.) In particular, do not use NULL when the ASCII null character (NUL) is desired.
Provide your own definition
#define NUL '\0'
if you must.
What is the difference between NULL, ‘\0’ and 0
"null character (NUL)" is easiest to rule out. '\0' is a character literal.
In C, a character literal has type int, so '\0' is the same as a plain 0 and has the size of an int. In C++, a character literal has type char, which is 1 byte. This is normally a different size from NULL or 0.
Next, NULL is a pointer value that specifies that a variable does not point to anything. Setting aside the fact that it is usually represented as all zeroes, it must be able to express the full address space of the architecture. Thus, on a 32-bit architecture NULL is (likely) 4 bytes and on a 64-bit architecture 8 bytes. This is up to the implementation of C.
Finally, the literal 0 has type int, whose size can differ depending on the architecture.
Apple wrote:
The 64-bit data model used by Mac OS X is known as "LP64". This is the common data model used by other 64-bit UNIX systems from Sun and SGI as well as 64-bit Linux. The LP64 data model defines the primitive types as follows:
ints are 32-bit
longs are 64-bit
long-longs are also 64-bit
pointers are 64-bit
Wikipedia 64-bit:
Microsoft's VC++ compiler uses the LLP64 model.
64-bit data models
Data model   short  int  long  long long  pointers  Sample operating systems
LLP64        16     32   32    64         64        Microsoft Win64 (X64/IA64)
LP64         16     32   64    64         64        Most Unix and Unix-like systems (Solaris, Linux, etc.)
ILP64        16     64   64    64         64        HAL
SILP64       64     64   64    64         64        ?
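You can check which data model your own toolchain uses with a few sizeof calls; a quick sketch (the values printed will of course depend on your platform and compiler):
#include <stdio.h>

int main(void)
{
    printf("short     : %zu bytes\n", sizeof(short));
    printf("int       : %zu bytes\n", sizeof(int));
    printf("long      : %zu bytes\n", sizeof(long));
    printf("long long : %zu bytes\n", sizeof(long long));
    printf("pointer   : %zu bytes\n", sizeof(void *));
    return 0;
}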
Edit:
Added more on the character literal.
#include <stdio.h>
int main(void) {
    printf("%zu\n", sizeof('\0'));
    return 0;
}
The above code prints 4 when compiled as C with gcc and 1 when compiled as C++ with g++.
One good piece which helped me when starting with C (taken from Expert C Programming by Peter van der Linden):
The One 'l' nul and the Two 'l' null
Memorize this little rhyme to recall the correct terminology for pointers and ASCII zero:
The one "l" NUL ends an ASCII string,
The two "l" NULL points to no thing.
Apologies to Ogden Nash, but the three "l" nulll means check your spelling.
The ASCII character with the bit pattern of zero is termed a "NUL".
The special pointer value that means the pointer points nowhere is "NULL".
The two terms are not interchangeable in meaning.
A one-L NUL, it ends a string.
A two-L NULL points to no thing.
And I will bet a golden bull
That there is no three-L NULLL.
How do you deal with NUL?
"NUL" is not 0, but refers to the ASCII NUL character. At least, that's how I've seen it used. The null pointer is often defined as 0, but this depends on the environment you are running in, and the specification of whatever operating system or language you are using.
In ANSI C, the null pointer is specified as the integer value 0. So any world where that's not true is not ANSI C compliant.
A byte with a value of 0x00 is, on the ASCII table, the special character called NUL or NULL. In C, since you shouldn't embed control characters in your source code, this is represented in C strings with an escaped 0, i.e., \0.
But a true NULL is not a value. It is the absence of a value. For a pointer, it means the pointer has nothing to point to. In a database, it means there is no value in a field (which is not the same thing as saying the field is blank, 0, or filled with spaces).
The actual value a given system or database file format uses to represent a NULL isn't necessarily 0x00.
NULL is not guaranteed to be represented by all-zero bits -- the representation of a null pointer is architecture-dependent. Most implementations define NULL as 0 or (void *)0.
'\0' will always equal 0, because that is how byte 0 is encoded in a character literal.
I don't remember whether C compilers are required to use ASCII -- if not, '0' might not always equal 48. Regardless, it's unlikely you'll ever encounter a system which uses an alternative character set like EBCDIC unless you're working on very obscure systems.
The sizes of the various types will differ on 64-bit systems, but the integer values will be the same.
Some commenters have expressed doubt that NULL could compare equal to 0 yet not be represented by all-zero bits. Here is an example program, along with the output you might expect on such a (hypothetical) system:
#include <stdio.h>
int main(void) {
    size_t ii;
    int *ptr = NULL;
    unsigned char *null_value = (unsigned char *)&ptr;  /* inspect the pointer's bytes */

    if (NULL == 0) {
        printf("NULL == 0\n");
    }

    printf("NULL = 0x");
    for (ii = 0; ii < sizeof(ptr); ii++) {
        printf("%02X", null_value[ii]);
    }
    printf("\n");
    return 0;
}
That program could print:
NULL == 0
NULL = 0x00000001
(void*) 0 is NULL, and '\0' represents the end of a string.
