Pointer address span on various platforms - c

A common situation when coding in C is writing functions that return pointers. If an error occurs inside such a function at runtime, NULL may be returned to indicate it. NULL is just the special memory address 0x0, which is never used for anything but to indicate the occurrence of a special condition.
My question is, are there any other special memory addresses which never will be used for userland application data?
The reason I want to know this is because it could effectively be used for error handling. Consider this:
#include <stdlib.h>
#include <stdio.h>

#define ERROR_NULL 0x0
#define ERROR_ZERO 0x1

int *example(int *a) {
    if (*a < 0)
        return ERROR_NULL;
    if (*a == 0)
        return (void *) ERROR_ZERO;
    return a;
}

int main(int argc, char **argv) {
    if (argc != 2) return -1;
    int *result;
    int a = atoi(argv[1]);
    switch ((int) (result = example(&a))) {
    case ERROR_NULL:
        printf("Below zero!\n");
        break;
    case ERROR_ZERO:
        printf("Is zero!\n");
        break;
    default:
        printf("Is %d!\n", *result);
        break;
    }
    return 0;
}
Knowing of some special span of addresses that will never be used by userland applications would allow it to be used for more efficient and cleaner condition handling. If you know of such a span, for which platforms does it apply?
I guess spans would be operating system specific. I'm mostly interested in Linux, but it would be nice to know for OS X, Windows, Android and other systems as well.

NULL is just the special memory address 0x0, which is never used for anything but to indicate the occurrence of a special condition.
That is not exactly right: there are computers where the NULL pointer is not zero internally (link).
are there any other special memory addresses which never will be used for userland applications?
Even NULL is not universal; there are no other universally unused memory addresses, which is not surprising, considering the number of different platforms programmable in C.
However, nobody stops you from defining your own special address in memory, setting it in a global variable, and treating it as your error indicator. This will work on all platforms, and would not require a special address location.
In the header:
extern void* ERROR_ADDRESS;
In a C file:
static int UNUSED;
void *ERROR_ADDRESS = &UNUSED;
At this point, ERROR_ADDRESS points to a globally unique location (i.e. the location of UNUSED, which is local to the compilation unit where it is defined), which you can use in testing pointers for equality.
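As a quick sketch of how this might be used (the function names here are purely illustrative):
#include <stdio.h>

extern void *ERROR_ADDRESS;            /* defined as shown above */

/* Hypothetical function: returns its argument, or the sentinel on error. */
int *validate(int *a)
{
    if (a == NULL || *a < 0)
        return ERROR_ADDRESS;
    return a;
}

void check(int *a)
{
    int *r = validate(a);
    if ((void *)r == ERROR_ADDRESS)
        printf("error\n");
    else
        printf("value: %d\n", *r);
}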

The answer depends a lot on your C compiler and on your CPU and OS, where your compiled C program is going to run.
Your userland applications typically will never be able to access data or code through pointers pointing to the OS kernel data and code. And the OS usually does not return such pointers to applications.
Typically they will also never get a pointer pointing to a location that isn't backed by physical memory. You can only get such pointers through an error (a code bug) or by purposefully constructing such a pointer.
The C standard does not define any valid range for pointers. In C, valid pointers are either null pointers or pointers to objects whose lifetime hasn't ended yet; those objects can be your global and local variables, objects created in malloc()'d memory, and functions. The OS may extend this set by returning:
pointers to code or data objects not explicitly defined in your C program at the source-code level (the OS may let apps access some of its own code or data directly, though this is uncommon, or it may let apps access parts of themselves that were created by the OS when the app loaded or by the compiler when the app was compiled; one example is Windows letting apps examine their own executable PE image: you can ask Windows where the image starts in memory)
pointers to data buffers allocated by the OS for/on behalf of apps (here, usually, the OS would use its own APIs and not your app's malloc()/free(), and you'd be required to use the appropriate OS-specific function to release this memory)
OS-specific pointers that can't be dereferenced and only serve as error indicators (e.g. there could be more than one non-dereferenceable pointer like NULL, and your ERROR_ZERO is a possible candidate)
I would generally discourage use of hard-coded and magic pointers in programs.
If, for some reason, a pointer is the only way to communicate error conditions and there is more than one of them, you could do this:
char ErrorVars[5] = { 0 };
void* ErrorPointer1 = &ErrorVars[0];
void* ErrorPointer2 = &ErrorVars[1];
...
void* ErrorPointer5 = &ErrorVars[4];
You can then return ErrorPointer1 through ErrorPointer5 on different error conditions and compare the returned value against them. There's a caveat here, though. You cannot legally compare a returned pointer with an arbitrary pointer using >, >=, <, <=. That's only legal when both pointers point to or into the same object. So, if you wanted a quick check like this:
if ((char*)(p = myFunction()) >= (char*)ErrorPointer1 &&
    (char*)p <= (char*)ErrorPointer5)
{
    // handle the error
}
else
{
    // success, do something else
}
it would only be legal if p equals one of those 5 error pointers. If it's not, your program can legally behave in any imaginable and unimaginable way (this is because the C standard says so). To avoid this situation you'll have to compare the pointer against each error pointer individually:
if ((p = myFunction()) == ErrorPointer1)
    HandleError1();
else if (p == ErrorPointer2)
    HandleError2();
else if (p == ErrorPointer3)
    HandleError3();
...
else if (p == ErrorPointer5)
    HandleError5();
else
    DoSomethingElse();
Again, what a pointer is and what its representation is, is compiler- and OS/CPU-specific. The C standard itself does not mandate any specific representation or range of valid and invalid pointers, so long as those pointers function as prescribed by the C standard (e.g. pointer arithmetic works with them). There's a good question on the topic.
So, if your goal is to write portable C code, don't use hard-coded and "magic" pointers and prefer using something else to communicate error conditions.

It completely depends on both the computer and the operating system. For example, on a computer with memory-mapped IO like the Game Boy Advance, you probably don't want to confuse the address for "what color is the upper left pixel" with userland data:
http://www.coranac.com/tonc/text/hardware.htm#sec-memory
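As an illustration (a sketch based on the memory map in the Tonc tutorial linked above; in bitmap mode 3 the top-left pixel is the first 16-bit word of VRAM):
#include <stdint.h>

#define MEM_VRAM 0x06000000                        /* GBA VRAM base address */
#define vid_mem  ((volatile uint16_t *)MEM_VRAM)   /* mode-3 framebuffer */

/* Writing here changes the upper-left pixel; the address belongs to hardware, not the heap. */
void set_top_left_pixel(uint16_t color)
{
    vid_mem[0] = color;
}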

You should not be worrying about addresses as a programmer: they differ between platforms, and there are quite a few layers between actual hardware addresses and your application. Physical-to-virtual translation is one of the big ones; the virtual address space is mapped onto memory, and on most modern operating systems each process has its own address space, protected at the hardware level from other processes.
What you are specifying here are just hexadecimal values; they aren't interpreted as addresses. A pointer set to NULL essentially says that it doesn't point to anything, not even address zero. It's just NULL. Whatever that value may be depends on the platform, the compiler and a lot of other things.
Setting a pointer to any other arbitrary value is not defined. A pointer is a variable that stores the address of another object; what you're trying to do is give the pointer a value other than a valid address.

This code:
#define ERROR_NULL 0x0
#define ERROR_ZERO 0x1

int *example(int *a) {
    if (*a < 0)
        return ERROR_NULL;
    if (*a == 0)
        return (void *) ERROR_ZERO;
    return a;
}
defines a function example that takes an input parameter a and returns its output as a pointer to int. When an error occurs, the function abuses a cast to void * to return the error code to the caller the same way it returns correct output data. This approach is wrong because the caller must know that what looks like a valid pointer sometimes doesn't contain the desired output but an error code instead.
are there any other special memory addresses which never will be used ... ?
... it could effectively be used for error handling
Don't make any assumptions about the possible addresses that might be returned. When you need to pass a return code to the caller, you should do it in a more straightforward way. You could take a pointer to the output data as a parameter and return an error code that identifies success or failure:
#define SUCCESS    0x0
#define ERROR_NULL 0x1
#define ERROR_ZERO 0x2

int example(int *a, int **out) {
    if (...)
        return ERROR_NULL;
    if (...)
        return ERROR_ZERO;
    *out = a;
    return SUCCESS;
}
...
int *out = NULL;
int retVal = example(..., &out);
if (retVal != SUCCESS)
    ...

Actually NULL (0) is a valid address, but it's not an address that you can typically write to.
From memory, NULL could be a different value on some old VAX hardware with a very old C compiler; maybe someone can confirm that. It will always be 0 now, as the C standard defines it; see this question: Is NULL always false?
Typically, the way errors are returned from functions is to set errno. You could piggy-back on this if the existing error codes make sense in your particular situation. If you need your own errors, you could do the same thing the errno method does, roughly as sketched below.
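A minimal sketch of the errno style (the function name and the choice of error values are just illustrative):
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns a pointer to a validated copy of *a, or NULL with errno set. */
int *lookup(const int *a)
{
    if (*a < 0) {
        errno = EDOM;        /* reuse an existing errno value that fits */
        return NULL;
    }
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;         /* malloc sets errno to ENOMEM on POSIX systems */
    *p = *a;
    return p;
}

void caller(const int *a)
{
    errno = 0;
    int *r = lookup(a);
    if (r == NULL)
        perror("lookup");    /* prints a message matching errno */
    else
        free(r);
}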
Personally I prefer not to return a void * but to make the function take a void ** and return the result there. Then you can return an error code directly, where 0 means success.
e.g.
int posix_memalign(void **memptr, size_t alignment, size_t size);
Note that the allocated memory is returned in memptr and the result code is returned by the function call, unlike malloc:
void *malloc(size_t size)
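A quick usage sketch of that style (assuming a POSIX system):
#define _POSIX_C_SOURCE 200112L   /* expose posix_memalign */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    void *buf = NULL;

    /* The pointer comes back through &buf; the error code is the return value. */
    int rc = posix_memalign(&buf, 64, 1024);
    if (rc != 0) {                /* 0 means success, EINVAL/ENOMEM otherwise */
        fprintf(stderr, "posix_memalign failed: %d\n", rc);
        return 1;
    }

    free(buf);
    return 0;
}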

On Linux, on 64-bit and when using the x86_64 architecture (either from Intel or AMD), only 48 bits of the total 64-bit address space are used (a hardware limitation AFAIK). Basically, any address after 2^47 and up to 2^62 can be used now, as it will not be allocated.
For some background, the virtual address space of a Linux process is made of a user and a kernel space. On the above-mentioned architecture, the first 47 bits (128 TB) are used for the user space. The kernel space sits at the other end of the spectrum, the last 128 TB of the full 64-bit address space. In between is terra incognita. That could change at any time in the future, though, and it is not portable.
But I can think of many other ways to return an error than your method, so I do not see the advantage of such a hack.

TL;DR:
Use -1 if you want just one more error condition besides NULL
For more special conditions just set the least significant bit(s): the value returned by the malloc() family or new is guaranteed to be aligned for any fundamental alignment and will always have its low bits zero, so they're free for use (as in a tagged pointer)
If allocation succeeds, returns a pointer that is suitably aligned for any object type with fundamental alignment.
https://en.cppreference.com/w/c/memory/malloc
Pointers to types wider than char are also always aligned. If you point to a char or a char array on the stack, just align it as necessary with alignas
For even more conditions you can limit the range of allocated addresses. This needs platform-specific code and there won't be a portable solution
As others said, it highly depends. However if you're on a platform with dynamic allocation then -1 is (extremely likely) a safe value.
That's because the memory allocator gives out memory in BIG BLOCKS instead of just single bytes§. Therefore the last address that can be returned would be -block_size. For example if block_size is 4 then the last block will span across the addresses { -4, -3, -2, -1 }, and the last possible address will be -4 = 0xFFFF...FFFC. As a result, -1 will never be returned by the malloc() family
Various system functions on Linux also return -1 for an invalid pointer instead of NULL, for example mmap() and shmat(). Win32 APIs that return a handle can also return NULL (0) or INVALID_HANDLE_VALUE (-1) for a failure case or an ill-formed handle. They have to do that because sometimes NULL is a valid memory address. In fact, if you're on a Harvard architecture then location zero in the data space is quite usable. And even on von Neumann architectures, what you said
"NULL is just the special memory address 0x0, which is never used for anything but to indicate the occurrence of a special condition"
is still wrong, because address 0 is also valid. It's just that most modern OSes map page zero in a way that makes it trap when user-space code dereferences it. Yet the page is accessible from within kernel code; there have been exploits related to NULL pointer dereference bugs in the Linux kernel
In fact, quite contrary to the zero page's original preferential use, some modern operating systems such as FreeBSD, Linux and Microsoft Windows actually make the zero page inaccessible to trap uses of NULL pointers. This is useful, as NULL pointers are the method used to represent the value of a reference that points to nothing
https://en.wikipedia.org/wiki/Zero_page
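To illustrate the -1 convention mentioned above for mmap() (a Linux/POSIX sketch; failure is signalled by MAP_FAILED, which is (void *) -1, not by NULL):
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    /* MAP_ANONYMOUS is a Linux/BSD extension; the mapping is not backed by a file. */
    void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {       /* (void *) -1, a value that is never a valid mapping */
        perror("mmap");
        return 1;
    }
    munmap(p, 4096);
    return 0;
}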
In MSVC and GCC, a NULL pointer-to-member is also represented as the bit pattern 0xFFFFFFFF on a 32-bit machine, and on AMD GCN a NULL pointer also has the value -1
You can go even further and return a lot more error codes by exploiting the fact that pointers are normally aligned. For example malloc always "aligns memory suitable for any object type (which, in practice, means that it is aligned to alignof(max_align_t))"
how does malloc understand alignment?
Which guarantees does malloc make about memory alignment?
Nowadays the default alignment for malloc is 8 or 16 bytes depending on whether you're on a 32- or 64-bit OS, which means you'll have at least 3 bits available for error reporting or other purposes. And if you use a pointer to a type wider than char then it's always aligned. So generally there's nothing to worry about unless you want to return a char pointer that's not output from malloc (in which case you can align it easily). Just check the least significant bit to see whether it's a valid pointer or not
int* result = func();
if ((uintptr_t)result & 1)
    error_happened(); // now the high bits can be examined to check the error condition
In the case of 16-byte alignment, the last 4 bits of a valid address are always 0s, and the total number of valid addresses is only ¹⁄₁₆ of the total number of bit patterns, which means you can return at most ¹⁵⁄₁₆ × 2^64 error codes with a 64-bit pointer. Then there's aligned_alloc if you want more least significant bits.
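Here is a sketch of how the low bit can carry an error code (all names are illustrative, and it assumes valid results always come from malloc() and are therefore aligned):
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* An error is encoded as (code << 1) | 1; an aligned malloc() result never has bit 0 set. */
static void *encode_error(uintptr_t code)  { return (void *)((code << 1) | 1u); }
static int is_error(const void *p)         { return ((uintptr_t)p & 1u) != 0; }
static uintptr_t error_code(const void *p) { return (uintptr_t)p >> 1; }

void *get_buffer(size_t n)
{
    if (n == 0)
        return encode_error(1);   /* hypothetical "empty request" error */
    void *p = malloc(n);
    if (p == NULL)
        return encode_error(2);   /* hypothetical "out of memory" error */
    return p;                     /* low bit is 0 for any malloc'd pointer */
}

void use(size_t n)
{
    void *p = get_buffer(n);
    if (is_error(p))
        printf("error %lu\n", (unsigned long)error_code(p));
    else
        free(p);
}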
That trick has been used for storing some information in the pointer itself. On many 64-bit platforms you can also use the high bits to store more data. See Using the extra 16 bits in 64-bit pointers
You can even go to the far extreme by limiting the range of the allocated pointers with some help from the OS. For example if you specify that the pointers must be allocated in the range 2-3GB then any addresses below 2GB and above 3GB will be available for you to indicate an error condition. On how to do that see:
Allocating Memory Within A 2GB Range
How can I ensure that the virtual memory address allocated by VirtualAlloc is between 2-4GB
Allocate at low memory address
How to malloc in address range > 4 GiB
Custom heap/memory allocation ranges
See also
Is ((void *) -1) a valid address?
§ That's obvious, since some information about the allocated block needs to be stored for bookkeeping; the block size must therefore be much larger than the metadata itself, otherwise the bookkeeping data would be even bigger than the allocations it tracks. Thus if you call malloc(1) the allocator still has to reserve a full block for you.

Related

Can the following code be true for pointers to different things

I have a piece of memory I am "guarding", defined by
typedef unsigned char byte;
byte * guardArea;
size_t guardSize;
byte * guardArea = getGuardArea();
size_t guardSize = getGuardSize();
An acceptable implementation for the sake of this would be:
size_t glGuardSize = 1024; /* protect an area of 1kb */

byte * getGuardArea()
{
    return malloc( glGuardSize );
}

size_t getGuardSize()
{
    return glGuardSize;
}
Can the following snippet return true for any pointer (from a different malloc, from the stack etc)?
if ( ptr >= guardArea && ptr < (guardArea + guardSize)) {
    return true;
}
The standard states that:
values within the area will return true (when ptr points into the area, everything behaves correctly).
pointers will be distinct (a == b only if they are the same).
all addresses within the byte array can be accessed by incrementing the base.
any pointer can be converted to and from a char *, without damage.
So I can't understand how the result could be true for any pointer from a different object (as it would break the distinct rule for one of the pointers within the area).
Edit:
What is the use case?
The ability to detect whether a pointer is within a region is really important; at some point code like this gets written:
if ( isInMyAreaOfInterest( unknownPointer ) ) {
    doMySpecialThing( unknownPointer );
} else {
    doSomethingElse( unknownPointer );
}
I think the language needs to support the developer by making such constructs simple and obvious, yet our interpretation of the standard is that the developer needs to cast to an integer, due to the "undefined behavior" of pointer comparisons between distinct objects.
I was hoping for some clarity on why I can't do what I would like (my snippet), as all the posts I found on SO say that the standard declares it undefined behavior, without any explanation or examples of why the standard is better than how I would like it to work.
At the moment we have a rule, and we neither understand why it exists nor question whether it is actually helping us.
Example posts:
SO: checking if a pointer is in a malloced area
SO: C compare pointers
It is still possible for an allocation to generate a pointer that satisfies the condition despite the pointer not pointing into the region. This will happen, for example, on an 80286 in protected mode, which is used by Windows 3.x in Standard mode and OS/2 1.x.
In this system, pointers are 32-bit values, split into two 16-bit parts, traditionally written as XXXX:YYYY. The first 16-bit part (XXXX) is the "selector", which chooses a bank of 64KB. The second 16-bit part (YYYY) is the "offset", which chooses a byte within that 64KB bank. (It's more complicated than this, but let's just leave it at that for the purpose of this discussion.)
Memory blocks larger than 64KB are broken up into 64KB chunks. To move from one chunk to the next, you add 8 to the selector. For example, the byte after 0101:FFFF is 0109:0000.
But why do you add 8 to move to the next selector? Why not just increment the selector? Because the bottom three bits of the selector are used for other things.
In particular, the bottom bit of the selector is used to choose the selector table. (Let's ignore bits 1 and 2 since they are not relevant to the discussion. Assume for convenience that they are always zero.)
There are two selector tables, the Global Selector Table (for memory shared across all processes) and the Local Selector Table (for memory private to a single process). Therefore, the selectors available for process private memory are 0001, 0009, 0011, 0019, etc. Meanwhile, the selectors available for global memory are 0008, 0010, 0018, 0020, etc. (Selector 0000 is reserved.)
Okay, now we can set up our counter-example. Suppose guardArea = 0101:0000 and guardSize = 0x00020000. This means that the guarded addresses are 0101:0000 through 0101:FFFF and 0109:0000 through 0109:FFFF. Furthermore, guardArea + guardSize = 0111:0000.
Meanwhile, suppose there is some global memory that happens to be allocated at 0108:0000. This is a global memory allocation because the selector is an even number.
Observe that the global memory allocation is not part of the guarded region, but its pointer value does satisfy the numeric inequality 0101:0000 <= 0108:0000 < 0111:0000.
Bonus chatter: Even on CPU architectures with a flat memory model, the test can fail. Modern compilers take advantage of undefined behavior and optimize accordingly. If they see a relational comparison between pointers, they are permitted to assume that the pointers point into the same array (or one past the last element of that array). Specifically, the only pointers that can legally be compared with guardArea are the ones of the form guardArea, guardArea+1, guardArea+2, ..., guardArea + guardSize. For all of these pointers, the condition ptr >= guardArea is true and can therefore be optimized out, reducing your test to
if (ptr < (guardArea + guardSize))
which will now be satisfied for pointers that are numerically less than guardArea.
Moral of the story: This code is not safe, not even on flat architectures.
But all is not lost: The pointer-to-integer conversion is implementation-defined, which means that your implementation must document how it works. If your implementation defines the pointer-to-integer conversion as producing the numeric value of the pointer, and you know that you are on a flat architecture, then what you can do is compare integers rather than pointers. Integer comparisons are not constrained in the same way that pointer comparisons are.
if ((uintptr_t)ptr >= (uintptr_t)guardArea &&
    (uintptr_t)ptr < (uintptr_t)guardArea + (uintptr_t)guardSize)
Yes.
#include <stdint.h>
#include <stdlib.h>

void foo(void) {}
void (*a)(void) = foo;            /* pointer into the code segment */
void *b = malloc(69);             /* pointer into the data segment */
uintptr_t ua = (uintptr_t)a, ub = (uintptr_t)b;
ua and ub are in fact permitted to have the same value. This occurred frequently on segmented systems (like MS-DOS) which might put code and data in separate segments.

The maximum memory location my C stack pointer can point to during initialization

Consider the following code on a Linux machine with a 32-bit OS:
void foo(int *pointer){
    int *buf;
    int *buf1 = pointer;
    ....
}
What is the maximum memory address buf and buf1 can point to using the above declaration (OS allocates the address)? E.g., can it point to address 2^32-200?
The reason I ask is that I may do pointer arithmetic on these buffers and I am concerned that this pointer arithmetic can wrap around. E.g., assume len is smaller than the size of buf and buf1, and that some_pointer points to the end of the buffer.
unsigned char len = 255;

if(buf + len > some_pointer)
    //do something

if(buf1 + len > some_pointer)
    //do something
The standard says that
For two elements of an array, the address of the element with the lower subscript will always compare less than the address of the element with the higher subscript.
Comparing any two elements that are not part of the same aggregate (array or struct) is undefined behavior.
So if buf + len and some_pointer point to elements in the same array as buf (or one past the array), you don't have to worry about wrap-around. If one of them doesn't, you have undefined behavior anyway.
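If you want to keep the whole check inside defined territory, one option is to compare offsets instead of forming buf + len first; a minimal sketch, assuming buf and some_pointer refer to the same array:
#include <stddef.h>

/* Sketch: buf points into an array and some_pointer points one past its last
 * element, so the subtraction below is well-defined. */
int fits(const int *buf, const int *some_pointer, unsigned char len)
{
    ptrdiff_t remaining = some_pointer - buf;   /* elements left before the end */
    return (ptrdiff_t)len < remaining;          /* true if buf + len stays inside the array */
}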
You shouldn't ever rely on the addresses provided by the allocator falling within a specific range. Even if you could show that, on a particular Linux setup, malloc can only generate addresses between X and Y, there is no guarantee; it could change with any future update. The only guarantee from malloc is that successful allocations won't start at NULL (address 0 in code, for Linux and most other typical platforms).
Yes, for a 32-bit or 64-bit OS. Whether there's anything usable there, or whether you'll get an access violation trying to dereference the pointer, is up to the compiler and OS.
The OS can map pages of physical memory anywhere in the address space. The addresses you see don’t correspond to physical RAM chips at all. The OS might, for example, have virtual memory or copy-on-write pages.

Stuffing a -1 in a pointer as a special value

In C, can one stuff a -1 value (e.g. 0xFFFFFFFF) into a pointer, using an approach such as this one, and expect that such a memory address is never allocated at runtime?
The idea is that the pointer value is used as a memory address unless it has this "special" -1 value. The pointer should be considered a memory address even if it is NULL (in which case the object it points to has not yet been built).
I understand this may be platform dependent, but the program in question is expected to run on Linux, Windows and Mac OS X.
The problem at hand is much larger than what is described here, so comments or answers which question this approach are not useful. I know it's a bit hacky, but the alternative is a major refactor :/
Thanks in advance.
It is GRAS (generally recognized as safe). No major OS will allocate memory that would collide with your chosen sentinel. However, there are a few pathological cases where it would be invalid to make this assumption. For instance, a pathological C++ compiler may choose to start the stack at 0xFFFFFFFF, without violating any constraints in the spec.
Within the scope of sane OSes, it is nearly impossible for 0xFFFFFFFF (or its 64-bit equivalent) to be a valid memory address. It cannot be a valid address within an array (C++ rules forbid it). It could technically be the address of a lone char object allocated at the very end of the address space, but there are two things that prevent that:
Most OSs have some padding
Most OSs use high memory values as Kernel memory.
If you have an opportunity to use a global value as a sentinel, it is guaranteed to be safe.
char sentinel;

char* p = "Hello";
char* p2 = 0; // null pointer
char* p3 = &sentinel;

if (p3 == &sentinel)
    cout << "p3 was a sentinel" << endl;
One way to define a sentinel value that no other valid address will coincide with is a static variable:
static t sentinel;
t *p = &sentinel;
If you are going to assume a flat address space and that all pointers have the same width, you can minimize the overhead by declaring sentinel of type char instead of t.
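That is, something along these lines (a sketch; the SENTINEL name is illustrative):
static char sentinel;               /* a single byte still has a unique address */
void *const SENTINEL = &sentinel;   /* compare returned pointers against this */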
To answer your question about (t*)-1:
-1 has type int. I would recommend (t*)(uintptr_t)-1, which is more likely to be the last address even for a 64-bit flat address space.
it is not very clean, but it should work on all commonplace architectures because, as long as the compiler compares pointers using the unsigned comparison assembly instruction (as it usually does), for any object a that the compiler could hope to place at the end of the address space, &a + 1 has to compare greater than &a. In practice, this prevents the last address from being used to store anything.

Initializing variable at address zero in C

This may be a pretty basic question. I understand that there is a C convention to set the value of null pointers to zero. Is it possible that you can ever allocate space for a new variable in Windows, and the address of that allocated space happens to be zero? If not, what usually occupies that address region?
On MS-DOS the null pointer is a fairly valid pointer, and because the OS runs in real mode it is actually possible to overwrite the 0x0 address with garbage and corrupt the kernel. You could do something like:
int i;
unsigned char* ptr = (unsigned char *)0x0;

for(i = 0; i < 1024; i++)
    ptr[i] = 0x0;
Modern operating systems (e.g. Linux, Windows) run in protected mode, which never gives you direct access to physical memory.
The processor maps the physical addresses to the virtual addresses that your program makes use of.
It also keeps track of what you access, and should you touch something that doesn't belong to you, you'll be in trouble (your program will segfault). This most definitely includes trying to dereference the 0x0 address.
When you "set the value of a pointer to zero" as in
int *p = 0;
it will not necessarily end up pointing to physical address zero, as you seem to believe. When a pointer is assigned a constant zero value (or initialized with it), the compiler is required to recognize that situation and treat it in a special way: it must replace that zero with an implementation-dependent null-pointer value. The latter does not necessarily point to address zero.
The null pointer value is supposed to be represented by a physical address that won't be used for any other purpose. If in some implementation physical address zero is a usable address, then such an implementation will have to use a different physical address to represent null pointers. For example, some implementation might use address 0xFFFFFFFF for that purpose. In such an implementation the initialization
int *p = 0;
will actually initialize p with physical 0xFFFFFFFF, not with physical zero.
P.S. You might want to take a look at the FAQ: http://c-faq.com/null/index.html, which is mostly dedicated to exactly that issue.
The value 0 has no special meaning. It is a convention to set a pointer to 0, and the C compiler has to interpret it accordingly. However, there is no connection to physical address 0, and in fact that address can be a valid address. On many systems, though, the lower addresses contain hardware-related data, like interrupt vectors. On the Amiga, for example, address 4 was the entry point into the operating system, which is also an arbitrary decision.
If the address of allocated space is zero, there is insufficient memory available. That means your variable could not be allocated.
The address 0x0 is where the CPU starts executing when you power it on. Usually at this address there's a jump to the BIOS code, and IIRC the first 64K (or more) are reserved for other tasks (determined by the BIOS/UEFI). It's an area which is not accessible by an application.
Given that it should be clear that you cannot have a variable at address 0x0 in Windows.

How can I access an interrupt vector located at the machine's location 0?

How can I access an interrupt vector located at the machine's location 0? If I set a pointer to 0, the compiler might translate it to some nonzero internal null pointer value.
If you're working in a domain where you need to read/write vectors at address zero (I guess you're on some kind of embedded system?), then you will find that a lot of what you're doing falls outside a general straight-down-the-middle interpretation of the C standard.
You can pretty safely assume that embedded compilers will generate accesses to the addresses you put into pointers, and not magically change their contents.
However, you can't always assume that physical address zero in a datasheet is what the processor accesses when you read/write address zero. If there's an MMU in the processor, then you might need to go through a logical-to-physical mapping process of some kind, and even if there isn't a full MMU, many modern small embedded processors play games with the address space around interrupt vectors (booting out of flash and then optionally remapping that part of the address space to RAM, for example).
You are correct that, depending on your C implementation, the internal null pointer might not be 0. But it usually is. If you want to be very safe, you can typecast the value, or use memset().
uintptr_t zerobits = 0;
void *pointer1 = (void *)zerobits;
or:
void *pointer2;
memset(&pointer2, 0, sizeof pointer2);
Since whatever is at location 0 is obviously machine dependent, you're free to use whatever machine-dependent trick will work to get there. Read your vendor's documentation. It's likely that if it's at all meaningful for you to be accessing location 0, the system will be set up to make it reasonably easy to do so. Some possibilities are:
Simply set a pointer to 0. (This is the way that doesn't have to work, but if it's meaningful, it probably will.)
Assign the integer 0 to an int variable, and convert that int to a pointer. (This is also not guaranteed to work, but it probably will.)
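For example (a minimal sketch; whether the resulting pointer actually designates location 0 is implementation-defined):
int zero = 0;
int *vector0 = (int *)zero;   /* a run-time zero is not a null pointer constant */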
Use a union to set the bits of a pointer variable to 0:
union {
    int *u_p;
    int u_i;    /* assumes sizeof(int) >= sizeof(int *) */
} p;
p.u_i = 0;
Use memset to set the bits of a pointer variable to 0:
memset((void *)&p, 0, sizeof(p));
Declare an external variable or array
extern int location0;
and use an assembly language file, or some special linker invocation, to arrange that this symbol refers to (i.e. the variable is placed at) address 0.
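With the GNU toolchain, for instance, the linker part can be done with something like cc -o prog prog.c -Wl,--defsym=location0=0 (a sketch; details vary by target, and an aggressive optimizer may assume a declared object never sits at address 0):
extern int location0;            /* placed at absolute address 0 by --defsym above */

unsigned read_vector0(void)
{
    return (unsigned)location0;  /* reads the word stored at machine address 0 */
}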
