Difference between unsigned and signed int pointer - c

Is there such a thing as an unsigned int* that is different from int*? I know that unsigned has a higher range of values. Still, can't an int* point to any unsigned int?

int * and unsigned int * are two different pointer types that are not compatible types. They are also pointers to incompatible types. For the definition of compatible types, please refer to § 6.2.7 in the C Standard (C11).
Being pointers to incompatible types means, for example, that this:
unsigned int a = 42;
int *p = &a; // &a is of type unsigned int *
is not valid (the constraints of the assignment operator are violated).
Another difference between the two types is that, as for most other pointer types, C does not guarantee that they have the same size or the same representation (although a difference is unlikely here).

Using an unsigned pointer to point to the signed version of the same type is defined by the C Standard.
Therefore interpreting an int through an unsigned int pointer and vice-versa is valid.
ISO/IEC 9899:201x 6.5 Expressions, p7:
An object shall have its stored value accessed only by an lvalue expression that has one of
the following types: 88)
— a type that is the signed or unsigned type corresponding to the effective type of the
object,
— a type that is the signed or unsigned type corresponding to a qualified version of the
effective type of the object,
88) The intent of this list is to specify those circumstances in which an object may or may not be aliased.
Effective type is basically the type of the object:
The effective type of an object for an access to its stored value is the declared type of the
object, if any.
An issue has been raised about the interpretation of the above rule. The following is my additional rationale about it.
The text below is listed purely for semantic reasoning about the word corresponding, and not for the direct rules it specifies.
6.2.5 Types
p6: For each of the signed integer types, there is a corresponding (but different) unsigned
integer type (designated with the keyword unsigned) that uses the same amount of
storage (including sign information) and has the same alignment requirements.
p9: The range of nonnegative values of a signed integer type is a subrange of the
corresponding unsigned integer type, and the representation of the same value in each
type is the same.41)
p12: For each floating type there is a corresponding real type, which is always a real floating
type. For real floating types, it is the same type. For complex types, it is the type given
by deleting the keyword _Complex from the type name.
p27: Further, there is the _Atomic qualifier. The presence of the _Atomic qualifier
designates an atomic type. The size, representation, and alignment of an atomic type
need not be the same as those of the corresponding unqualified type
6.2.6.2 Integer types
p2: For signed integer types, the bits of the object representation shall be divided into three
groups: value bits, padding bits, and the sign bit. There need not be any padding bits;
signed char shall not have any padding bits. There shall be exactly one sign bit.
Each bit that is a value bit shall have the same value as the same bit in the object
representation of the corresponding unsigned type
p5: The values of any padding bits are unspecified.54) A valid (non-trap) object representation
of a signed integer type where the sign bit is zero is a valid object representation of the
corresponding unsigned type, and shall represent the same value.
(And many more examples with identical usage of the word corresponding.)
As you can see in the above snippets, the Standard uses the word corresponding to refer to different types, or to types with different specifiers and/or qualifiers. In other words, the Standard uses the word as in this example: a qualified type corresponds to its unqualified type.
It would be illogical to suddenly use the word corresponding for a different purpose, namely to refer to completely identically qualified/specified types, and to confuse matters even more by including the words signed and unsigned in the same sentence for no good reason.
The intention of 6.5 p7 is: a type that is either the signed or the unsigned type corresponding to the effective type of the object, that is, a type that otherwise matches (corresponds to) the effective type. So for example, if the effective type is int, then both int and unsigned int correspond to that type.

unsigned int * and int * are different types. To convert one to the other you must use a cast.
If you read a value through a pointer then it attempts to interpret the bits stored at that memory location as if they were bits for the type being pointed to by the pointer you are reading through.
If the bits at that memory location were not written by a pointer of the same type you are reading through, then this is called aliasing.
The strict aliasing rule specifies which types may or may not be aliased; aliasing between a type's signed and unsigned versions is always permitted.
However, if the bits are not a valid representation of a value in the type you are reading, then it causes undefined behaviour.
On modern systems there are no such "trap" representations so you have no issue. But let's say you were on a 1's complement system that trapped on negative zero:
unsigned int x = 0xFFFFFFFF;
int *y = (int *)&x;
printf("%d\n", *y);
The attempt to read *y could cause a hardware fault or any other behaviour.

The value of the pointer is the same, but they are different types. A difference arises depending on how you use the pointer, for example when dereferencing it.
unsigned int *u;
int *d;
unsigned int v = 2147483648; /* 2^31 */
u = &v;
d = (int*) &v;
printf("%u\n", *u);
printf("%d\n", *d);
On a typical implementation with a 32-bit two's complement int, this will output:
2147483648
-2147483648
The difference in the output arises because in printf("%d\n", *d), d is dereferenced and printed as if it points to a signed int, except it isn't. So you have to keep a distinction between the 2 types of pointers in your code.

It can, since both typically have the same size. The problem is that this will introduce a hard-to-find bug, because you'll interpret a signed value as an unsigned one or vice versa.

A pointer is a number that is a memory address. So pointers have to have enough precision to be able to address all of memory for the implementation.
Whether you reference a signed or unsigned int makes no difference in the internal structure of the pointer, because, in theory anyway, the int or unsigned int could be almost anywhere in memory. The datatype (unsigned) has to be declared to "help" the compiler decide correctness of the code.

Related

Signed integer type and its corresponding unsigned integer type

For each signed integer type the Standard guarantees existence of a corresponding unsigned integer type. 6.2.5 p6:
For each of the signed integer types, there is a corresponding (but
different) unsigned integer type (designated with the keyword unsigned
) that uses the same amount of storage (including sign information)
and has the same alignment requirements.
The phrase "designated with the keyword unsigned" confused me, so I consulted earlier versions of the Standard to see whether it was present there. C89/3.2.1.5 provides exactly the same wording:
For each of the signed integer types, there is a corresponding (but
different) unsigned integer type (designated with the keyword
unsigned) that uses the same amount of storage (including sign
information) and has the same alignment requirements.
Now consider uintptr_t and intptr_t; uintmax_t and intmax_t; etc... (which are optional, but in case an implementation defines those types).
QUESTION: According to the definition I cited above isn't uintptr_t a corresponding unsigned integer type for intptr_t and uintmax_t is a corresponding unsigned integer type for intmax_t?
I'm concerned about it because the usual arithmetic conversions use the term (6.3.1.8 p1):
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type
So I'm trying to understand the semantics of the usual arithmetic conversions applied to, say, uintptr_t and intptr_t.
According to 7.20(4) these are typedef names, not the underlying types.
For each type described herein that the implementation provides,261) <stdint.h> shall declare that typedef name and define the associated macros.
And 7.20.1(1) says:
When typedef names differing only in the absence or presence of the initial u are defined, they shall denote corresponding signed and unsigned types as described in 6.2.5;
So I believe these are required to follow the same default conversion rules as the basic integer types are.
The header in the Standard "7.20.1.5 Greatest-width integer types" and the description under the header where the two types are described in a pair assume that uintmax_t is defined as an unsigned type corresponding to the type intmax_t.
The intN_t and uintN_t fixed width types weren't introduced until C99, so that may be why the standard you're referencing lacks information about them.

Is there any difference between uintptr_t and unsigned int when unsigned int can hold any address?

Description of uintptr_t:
The following type designates an unsigned integer type with the
property that any valid pointer to void can be converted to this type,
then converted back to pointer to void, and the result will compare
equal to the original pointer:
uintptr_t
And since any pointer can be converted to void pointer and vice versa:
A pointer to void may be converted to or from a pointer to any object
type. A pointer to any object type may be converted to a pointer to
void and back again; the result shall compare equal to the original
pointer.
Any pointer can be converted to uintptr_t and vice versa, OK.
Now, description of integers and pointers:
[Integer -> Pointer]
An integer may be converted to any pointer type. Except as previously
specified, the result is implementation-defined, might not be
correctly aligned, might not point to an entity of the referenced
type, and might be a trap representation
[Pointer -> Integer]
Any pointer type may be converted to an integer type. Except as
previously specified, the result is implementation-defined. If the
result cannot be represented in the integer type, the behavior is
undefined. The result need not be in the range of values of any
integer type.
OK. Now, since in my system's ABI (Procedure call standard for ARM architecture) both unsigned int and pointers have same size and alignment, and my system uses plain 32bit continuous values starting from 0x0 for memory addresses, it seems that the implementation-defined gap in the conversion of
Integer -> Pointer and Pointer -> Integer
has been filled in my system, and I can safely convert pointers to unsigned integers, and there is no difference between converting a pointer into uintptr_t and converting a pointer to unsigned int in my system (both will yield same value). Am I right with my assumption? or there is something I'm missing?
Even given that unsigned int has enough bits to represent all addresses in the C implementation, the C standard does not guarantee this means that, given a void * pointer p, the expression (void *) (unsigned) p == p evaluates to true. Because the conversion from void * to an integer is implementation-defined, it might do more than simply reproduce the address as an unsigned value. It might include some bits describing the provenance of the address or a checksum, and unsigned might be insufficient to contain the necessary information to restore the original value.
Most implementations are likely to simply convert the address in the obvious way, reinterpreting the bits of the virtual memory address as an unsigned value, and no problems will arise. However, this is a feature of the implementation; it is not a requirement of the C standard.

How is "signed or unsigned type" meant in this C90 undefined behaviour definition?

In the ANSI C90 standard, section 6.3 has this to say about expressions:
An object shall have its stored value accessed only by an lvalue that has one of the following types: [...] a type that is the signed or unsigned type corresponding to a qualified version of the declared type of the object
And there is this instance of undefined behaviour in Annex G.2:
The behavior in the following circumstances is undefined: [...] An object has its stored value accessed by an lvalue that does not have one of the following types: the declared type of the object, a qualified version of the declared type of the object, the signed or unsigned type corresponding to the declared type of the object, the signed or unsigned type corresponding to a qualified version of the declared type of the object, an aggregate or union type that (recursively) includes one of the aforementioned types among its members, or a character type (6.3).
I find the wording of the emphasised parts ambiguous and am struggling to interpret it.
Does it mean "the signed type corresponding to the original type if it was signed, or the unsigned type corresponding to the original type if it was unsigned"; or "the type (whether signed or unsigned doesn't matter) corresponding to the original type"? That is, is:
signed int a = -10;
unsigned int b = *((unsigned int *) &a);
...undefined?
If signed/unsigned doesn't matter, given that the standard makes the distinction between the three types char, signed char, and unsigned char, would accessing a char via signed char * or unsigned char * be defined?
It's saying that it's not undefined behavior to cast the value to a different signedness. If the object is declared signed int, you can access it using an unsigned int lvalue, and vice versa.
The case where the signedness is the same is already covered when it says "the declared type of the object", although this case could also be considered to say that.
In the case of char, both signed char and unsigned char are "the signed or unsigned type corresponding to" that type.
All together it's just saying that the signedness of the lvalue doesn't affect whether the access is well-defined.
Please note that Annex G is informative and the relevant part to quote is normative C90 6.3.
This refers to the precursor to the "strict aliasing rule" later introduced in C99. In C90, it was ambiguous what to do with objects that had no type, such as the data pointed at by the return from malloc.
It means that if the type of the object is either signed int or unsigned int, you can do a lvalue access either with signed int* or unsigned int*. These two pointer types are allowed to alias. So for example if you have a function like this:
void func (signed int* a, unsigned int* b)
then the compiler cannot assume that a and b point to different objects.
(Note that wildly exotic systems can in theory have padding bits and trap representations for signed types, so accessing an unsigned int through a signed int* could be UB for other reasons, in theory.)
The character types are indeed a special case compared to other integer types. But it doesn't matter here, since the rule has a special case too: "or a character type". char, unsigned char and signed char are all character types. This means that all lvalue accesses using any of these 3 types are well-defined.
The lvalue type doesn't even need to be a character type! You can for example lvalue access an int through signed char* and it is well-defined, but not the other way around.
When C89 was written, unsigned types were a sufficiently new addition to the language that a lot of code used int in places where unsigned--once it existed--would have made more sense. The authors of the Standard wanted to ensure that functions that used the newer unsigned type would be able to exchange data with those that had been written to use int because unsigned hadn't existed yet.
The Standard is a bit ambiguous as to whether a type like unsigned* has a "corresponding signed type" int*, or unsigned** has a "corresponding unsigned type" int**, etc. Given the purpose of allowing interaction between code that predates unsigned types and code that uses them, making a function that's written to operate on sequences of int* unusable by clients that have sequences of unsigned* would be contrary to that purpose and also to the Committee's charter. Upholding the stated purpose wouldn't require that int** be universally usable to access objects of type unsigned*, but would require that compilers given constructs like:
unsigned *foo[10];
actOnIntPtrs((int**)foo, 10);
recognize that the called function might affect objects of type unsigned* stored in foo.

Can memcpy be used for type punning?

This is a quote from the C11 Standard:
6.5 Expressions
...
6 The effective type of an object for an access to its stored value is the declared type of the object, if any. If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.
7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
Does this imply that memcpy cannot be used for type punning this way:
double d = 1234.5678;
uint64_t bits;
memcpy(&bits, &d, sizeof bits);
printf("the representation of %g is %08"PRIX64"\n", d, bits);
Why would it not give the same output as:
union { double d; uint64_t i; } u;
u.d = 1234.5678;
printf("the representation of %g is %08"PRIX64"\n", u.d, u.i);
What if I use my version of memcpy using character types:
void *my_memcpy(void *dst, const void *src, size_t n) {
unsigned char *d = dst;
const unsigned char *s = src;
for (size_t i = 0; i < n; i++) { d[i] = s[i]; }
return dst;
}
EDIT: EOF commented that The part about memcpy() in paragraph 6 doesn't apply in this situation, since uint64_t bits has a declared type. I agree, but, unfortunately, this does not help answer the question whether memcpy can be used for type punning, it just makes paragraph 6 irrelevant to assess the validity of the above examples.
Here is another attempt at type punning with memcpy that I believe would be covered by paragraph 6:
double d = 1234.5678;
void *p = malloc(sizeof(double));
if (p != NULL) {
uint64_t *pbits = memcpy(p, &d, sizeof(double));
uint64_t bits = *pbits;
printf("the representation of %g is %08"PRIX64"\n", d, bits);
}
Assuming sizeof(double) == sizeof(uint64_t), does the above code have defined behavior under paragraphs 6 and 7?
EDIT: Some answers point to the potential for undefined behavior coming from reading a trap representation. This is not relevant as the C Standard explicitly excludes this possibility:
7.20.1.1 Exact-width integer types
1 The typedef name intN_t designates a signed integer type with width N, no padding bits, and a two’s complement representation. Thus, int8_t denotes such a signed integer type with a width of exactly 8 bits.
2 The typedef name uintN_t designates an unsigned integer type with width N and no padding bits. Thus, uint24_t denotes such an unsigned integer type with a width of exactly 24 bits.
These types are optional. However, if an implementation provides integer types with widths of 8, 16, 32, or 64 bits, no padding bits, and (for the signed types) that have a two’s complement representation, it shall define the corresponding typedef names.
Type uint64_t has exactly 64 value bits and no padding bits, thus there cannot be any trap representations.
There are two cases to consider: memcpy()ing into an object that has a declared type, and memcpy()ing into an object that does not.
In the second case,
double d = 1234.5678;
void *p = malloc(sizeof(double));
assert(p);
uint64_t *pbits = memcpy(p, &d, sizeof(double));
uint64_t bits = *pbits;
printf("the representation of %g is %08"PRIX64"\n", d, bits);
The behavior is indeed undefined, since the effective type of the object pointed to by p will become double, and accessing an object of effective type double through an lvalue of type uint64_t is undefined.
On the other hand,
double d = 1234.5678;
uint64_t bits;
memcpy(&bits, &d, sizeof bits);
printf("the representation of %g is %08"PRIX64"\n", d, bits);
is not undefined. C11 draft standard n1570:
7.24.1 String function conventions
3 For all functions in this subclause, each character shall be interpreted as if it had the type
unsigned char (and therefore every possible object representation is
valid and has a different value).
And
6.5 Expressions
7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 88)
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the
effective type of the object,
— an aggregate or union type that includes one of the aforementioned types
among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
Footnote 88) The intent of this list is to specify those circumstances in which an object may or may not be aliased.
So the memcpy() itself is well-defined.
Since uint64_t bits has a declared type, it retains its type even though its object representation was copied from a double.
As chqrlie points out, uint64_t cannot have trap representations, so accessing bits after the memcpy() is not undefined, provided sizeof(uint64_t) == sizeof(double). However, the value of bits will be implementation-dependent (for example due to endianness).
Conclusion: memcpy() can be used for type-punning, provided that the destination of the memcpy() does have a declared type, i.e. is not allocated by [m/c/re]alloc() or equivalent.
You propose 3 ways which all have different problems with C standard.
standard library memcpy
double d = 1234.5678;
uint64_t bits;
memcpy(&bits, &d, sizeof bits);
printf("the representation of %g is %08"PRIX64"\n", d, bits);
The memcpy part is legal (provided in your implementation sizeof(double) == sizeof(uint64_t) which is not guaranteed per standard): you access two objects through char pointers.
But the printf line is not. The representation in bits is now that of a double. It might be a trap representation for a uint64_t, as defined in 6.2.6.1 General §5:
Certain object representations need not represent a value of the object type. If the stored
value of an object has such a representation and is read by an lvalue expression that does
not have character type, the behavior is undefined. If such a representation is produced
by a side effect that modifies all or any part of the object by an lvalue expression that
does not have character type, the behavior is undefined. Such a representation is called
a trap representation.
And 6.2.6.2 Integer types says explicitly:
For unsigned integer types other than unsigned char, the bits of the object
representation shall be divided into two groups: value bits and padding bits ... The values of any padding bits are unspecified.53)
With note 53 saying:
Some combinations of padding bits might generate trap representations,
If you know that your implementation has no padding bits (I have still never seen one that does...), every representation is a valid value, and the print line becomes valid again. But that is implementation-dependent, and it can be undefined behaviour in the general case.
union
union { double d; uint64_t i; } u;
u.d = 1234.5678;
printf("the representation of %g is %08"PRIX64"\n", u.d, u.i);
The members of the union do not share a common initial sequence, and you are accessing a member that was not the last one written. Common implementations will give the expected results, but per the standard it is not explicitly defined what should happen. A footnote in 6.5.2.3 Structure and union members §3 says that it leads to the same problems as the previous case:
If the member used to access the contents of a union object is not the same as the member last used to
store a value in the object, the appropriate part of the object representation of the value is reinterpreted
as an object representation in the new type as described in 6.2.6 (a process sometimes called "type
punning"). This might be a trap representation.
custom memcpy
Your implementation only does character accesses, which are always allowed. It is exactly the same thing as the first case: implementation-defined.
The only way that would be explicitly defined per the standard would be to store the representation of the double in a char array of the correct size, and then display the byte values of the char array:
double d = 1234.5678;
unsigned char bits[sizeof(d)];
memcpy(bits, &d, sizeof(bits));   /* copy the object representation */
printf("the representation of %g is ", d);
for (size_t i = 0; i < sizeof(bits); i++) {
    printf("%02x", (unsigned int) bits[i]);   /* one byte at a time */
}
printf("\n");
And the result will only be usable if the implementation uses exactly 8 bits for a char. That would be visible, however, because a byte with a value greater than 255 would be displayed with more than two hex digits.
All of the above is only valid because bits has a declared type. Please see @EOF's answer to understand why it would be different for an allocated object.
I read paragraph 6 as saying that using the memcpy() function to copy a series of bytes from one memory location to another memory location can be used for type punning just as using a union with two different types can be used for type punning.
The first mention of memcpy() indicates that it copies the specified number of bytes, and that those bytes keep the type of the variable at the source location, that is, the type of the lvalue that was used to store the bytes there.
In other words if you have a variable double d; and you then assign a value to this variable (lvalue) the type of the data stored in that variable is type double. If you then use the memcpy() function to copy those bytes to another memory location, say a variable uint64_t bits; the type of those copied bytes is still double.
If you then access the copied bytes through the destination variable (lvalue), the uint64_t bits; in the example, then the type of that data is seen as the type of the lvalue used to retrieve the data bytes from that destination variable. So the bytes are interpreted (not converted but interpreted) as the destination variable type rather than the type of the source variable.
Accessing the bytes through a different type means the bytes are now interpreted as the new type even though the bytes have not actually changed in any way.
This is also the way a union works. A union does not do any kind of conversion. You store bytes into a union member which is of one type and then you pull the same bytes back out through a different union member. The bytes are the same however the interpretation of the bytes depends on the type of the union member that is used to access the memory area.
I have seen the memcpy() function used in older C source code to help divide up a struct into pieces by using struct member offset along with the memcpy() function to copy portions of the struct variable into other struct variables.
Because the type of the source location used in the memcpy() is the type of the bytes stored there the same kinds of problems that you can run into with the use of a union for punning also apply to using memcpy() in this way such as the Endianness of the data type.
The thing to remember is that whether using a union or using the memcpy() approach the type of the bytes copied are the type of the source variable and when you then access the data as another type, whether through a different member of a union or through the destination variable of the memcpy() the bytes are interpreted as the type of the destination lvalue. However the actual bytes are not changed.
CHANGED--SEE BELOW
While I have never observed a compiler to interpret a memcpy of non-overlapping source and destination as doing anything that would not be equivalent to reading all of the bytes of the source as a character type and then writing all of the bytes of the destination as a character type (meaning that if the destination had no declared type, it would be left with no effective type), the language of the Standard would allow obtuse compilers to make "optimizations" which--in those rare instances where a compiler would be able to identify and exploit them--would be more likely to break code which would otherwise work (and would be well-defined if the Standard were better written) than to actually improve efficiency.
As to whether that means that it's better to use memcpy or a manual byte-copy loop whose purpose is sufficiently well-disguised as to be unrecognizable as "copying an array of character type", I have no idea. I would posit that the sensible thing would be to shun anyone so obtuse as to suggest that a good compiler should generate bogus code absent such obfuscation, but since behavior that would have been considered obtuse in years past is presently fashionable, I have no idea whether memcpy will be the next victim in the race to break code which compilers had for decades treated as "well-defined".
UPDATE
GCC as of 6.2 will sometimes omit memmove operations in cases where it sees that the destination and source identify the same address, even if they are pointers of different types. If storage which had been written as the source type is later read as the destination type, gcc will assume that the latter read cannot identify the same storage as the earlier write. Such behavior on gcc's part is justifiable only because of the language in the Standard which allows the compiler to copy the Effective Type through the memmove. It's unclear whether that was an intentional interpretation of the rules regarding memcpy, however, given that gcc will also make a similar optimization in some cases where it is clearly not allowed by the Standard, e.g. when a union member of one type (e.g. 64-bit long) is copied to a temporary and from there to a member of a different type with the same representation (e.g. 64-bit long long). If gcc sees that the destination will be bit-for-bit identical to the temporary, it will omit the write, and consequently fail to notice that the effective type of the storage was changed.
It might give the same result, but the compiler does not need to guarantee it. So you simply cannot rely on it.

Cast pointer to larger int

I have a pointer. On a 32-bit system it's 32 bits. On a 64-bit system it's 64 bits.
I have a long long integer field used as an identifier, and sometimes I want to use the pointer value in there. (I never cast back to a pointer - once I've cast it to the integer field, I only ever compare it for equality).
On both 32-bit and 64-bit systems, it seems safe to do this. (On larger pointered systems not so). Is that true?
And then, is there a way to make GCC not give the following warning only when building on platforms where this is safe (which is, at the moment, all target platforms)?
error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
According to the standard, there is no guarantee that a pointer fits in an integer type. In practice, even on ordinary personal computers, several memory models exist, and pointer and integer types do not always have the same size.
You should rather use the optional types intptr_t and uintptr_t, from C99.
C11 (n1570), § 7.20.1.4
The following type designates a signed integer type with the property that any valid
pointer to void can be converted to this type, then converted back to pointer to void,
and the result will compare equal to the original pointer: intptr_t.
The following type designates an unsigned integer type with the property that any valid
pointer to void can be converted to this type, then converted back to pointer to void,
and the result will compare equal to the original pointer: uintptr_t.
Here is a small example:
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int n = 42;
    int *p = &n;
    intptr_t i = (intptr_t)(void *)p;   /* pointer -> integer */
    int *q = (void *)i;                 /* integer -> pointer */
    printf("%d\n", *q);                 /* prints 42 */
    return 0;
}
If you want to have an integer type that is guaranteed to be big enough to hold a pointer, consider intptr_t (signed) or uintptr_t. The standard guarantees that these datatypes are big enough to hold a pointer. Nothing else might be assumed, especially not that long int is long enough.
