Accessing a value from a pointer with a manually assigned address - C

I have assigned some random address to a pointer of a particular data type. Then I stored a value at that address. When I run the program, it terminates abruptly.
char *c=2000;
*c='A';
printf("%u",c);
printf("%d",*c);
I was able to print the value of c with the first printf statement, but I couldn't fetch the value stored at that address through the second one. I ran this with the Cygwin GCC compiler and also on the online ideone.com compiler; on ideone.com it shows a runtime error. What's the reason behind this?

When you assign the address 2000 to the pointer c, you are assuming that will be a valid address. Generally, though, it is not a valid address. You can't choose addresses at random and expect the compiler (and operating system) to have allocated that memory for you to use. In particular, the first page of memory (often 4 KiB, usually at least 1 KiB) is completely off limits; all attempts to read or write there are more usually indicative of bugs than intentional behaviour, and the MMU (memory management unit) is configured to reject attempts to access that memory.
If you're using an embedded microprocessor, the rules might well be different, but on a general purpose o/s like Windows with Cygwin, addresses under 0x1000 (4 KiB) are usually verboten.
You can print the address (you did it unreliably, but presumably your compiler didn't warn you; mine would have warned me about using a format for a 4-byte integer quantity to print an 8-byte address). But you can't reliably read or write the data at the address. There could be machines (usually mainframes) where simply reading an invalid address (even without accessing the memory it points at) generates a memory fault.
So, as Acme said in their answer, you've invoked undefined behaviour. You've taken the responsibility for assigning a valid address to your pointer away from the compiler, but you chose an invalid value. A crash is the consequence of that bad decision.
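For contrast, here is a minimal sketch (my own illustration, not part of the question) that prints the pointer with the correct %p format and only dereferences a pointer that actually refers to an object:

#include <stdio.h>

int main(void)
{
    char ch = 'A';              /* a real object the compiler has allocated */
    char *c = &ch;              /* the pointer now holds a valid address */

    printf("%p\n", (void *)c);  /* %p is the portable way to print a pointer */
    printf("%c\n", *c);         /* safe: c points at ch */
    return 0;
}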

char *c=2000;
Conversion of an integer value to a pointer type is implementation-defined behavior (and assigning an integer to a pointer without a cast, as above, is additionally a constraint violation that a conforming compiler must diagnose).
Implementation-defined behavior is defined by the ISO C Standard in section 3.4.1 as: "unspecified behavior where each implementation documents how the choice is made. EXAMPLE An example of implementation-defined behavior is the propagation of the high-order bit when a signed integer is shifted right."

Any code that relies on implementation-defined behaviour is only guaranteed to work on a specific platform and/or compiler. Portable programs should try to avoid such behaviour.
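A small sketch of the portable alternative, using the optional uintptr_t type from <stdint.h>: round-tripping a valid pointer through an integer is defined to give back an equal pointer, whereas dereferencing a pointer made up from an arbitrary integer such as 2000 is not.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int x = 42;
    uintptr_t n = (uintptr_t)&x;   /* pointer -> integer: the value is implementation-defined */
    int *p = (int *)n;             /* integer -> pointer: round-tripping a valid pointer is OK */

    printf("%d\n", *p);            /* prints 42; dereferencing (int *)2000 would be undefined */
    return 0;
}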

Related

Why is NULL not a valid memory address?

It may sound like a silly question, but since in C, NULL is literally defined as
#define NULL 0
why can't it be a valid memory address? Why can't I dereference it, and why would it be impossible for any data to be at the memory address 0?
I'm sure the answer to this is something like the "the first n bytes of memory are always reserved by the kernel", or something like that, but I can't find anything like this on the internet.
Another part of my reasoning is that, wouldn't this be platform independent? Couldn't I invent a new architecture where the memory address 0 is accessible to processes?
Dereferencing NULL is undefined behavior. Anything could happen, and most of the time bad things happen. So be scared.
Some old architectures (VAX ...) permitted you to dereference NULL.
The C11 standard specification (read n1570) does not require the NULL pointer to be all zero bits (see C FAQ Q5.17); it could be something else, but it should be an address that is never valid, so it is not obtainable by a successful malloc or by the address-of operator (unary &), in the sense of C11. But it is more convenient to have it all zero bits, and in practice most (but not all) C implementations do so.
IIRC, on Linux, you might mmap(2) the page containing (void*)0 with MAP_FIXED, but it is not wise to do so (e.g. because a conforming optimizing compiler is allowed to optimize dereference of NULL).
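For the curious, a sketch of what that mmap(2) attempt might look like (assuming a 4096-byte page); on a default Linux configuration it is expected to fail with EPERM because of vm.mmap_min_addr:

#define _DEFAULT_SOURCE
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    /* Attempt to map one page at address 0. Normally fails with EPERM;
       shown only to illustrate the point above. Don't do this in real code. */
    void *p = mmap((void *)0, 4096, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED)
        perror("mmap");
    else
        printf("page zero mapped at %p\n", p);
    return 0;
}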
So (void*)0 is not a valid address in practice (on common processors with some MMU and virtual memory running a good enough operating system!), because it is convenient to decide that it is NULL, and it is convenient to be sure that dereferencing it gives a segmentation fault. But that is not required by the C standard (and would be false on cheap microcontrollers today).
A C implementation has to provide some way to represent the NULL pointer (and guarantee that it is never the address of some valid location). That might even be done by a convention: e.g. provide a full 2^32-byte address space, but promise to never use address 0 (or whatever address you assigned for NULL, perhaps 42!)
When NULL happens to be dereferenceable, subtle bugs are not caught by a segmentation fault (so C programs are harder to debug).
Couldn't I invent a new architecture where the memory address 0 is accessible to processes?
You could, but you don't want to do that (if you care about providing any standard-conforming C implementation). You would prefer to make address 0 be NULL. Doing otherwise makes it harder to write C compilers (and standard C libraries). And making that address invalid, to the point of giving a segmentation fault when dereferencing it, makes debugging (and the life of your users coding in C) easier.
If you dream of weird architectures, read about Lisp machines (and the Rekursiv, and the iAPX 432) and see the talk The circuit less traveled by Liam Proven at FOSDEM 2018. It really is instructive, and it is a nice talk.
Making address zero unmapped so that a trap occurs if your program tries to access it is a convenience provided by many operating systems. It is not required by the C standard.
According to the C standard:
NULL is not the address of any object or function. (Specifically, the standard requires that NULL compare unequal to a pointer to any object or function.)
If you do apply * to NULL, the resulting behavior is not defined by the standard.
What this means for you is that you can use NULL as an indicator that a pointer is not pointing to any object or function. That is the only purpose the C standard provides for NULL: to use in tests such as if (p != NULL). The C standard does not guarantee that a trap will occur if you use *p when p is NULL.
In other words, the C standard does not require NULL to provide any trapping capability. It is just a value that is different from any actual pointer, provided just so you have one pointer value that means “not pointing to anything.”
General-purpose operating systems typically arrange for the memory at address zero to be unmapped (and their C implementations define NULL to be (void *) 0 or something similar) specifically so that a trap will occur if you dereference a null pointer. When they do this, they are extending the C language beyond what the specification requires. They deliberately exclude address zero from the memory map of your process to make these traps work.
However, the C standard does not require this. A C implementation is free to leave the memory at address zero mapped, and, when you apply * to a null pointer, there might be data there, and your program could read and/or write that data, if the operating system has allowed it. When this is done, it is most often in code intended to run inside the operating system kernel (such as device drivers, kernel extensions, or the kernel itself) or embedded systems or other special-purpose systems with simple operating systems.
The null pointer constant (NULL) is 0-valued. The null pointer value may be something other than 0. During translation, the compiler will replace occurrences of the null pointer constant with the actual null pointer value.
NULL does not represent “address 0”; rather, it represents a well-defined invalid pointer value that is guaranteed not to point to any object or function, and attempts to dereference invalid pointers lead to undefined behavior.
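To make the intended use concrete, here is a minimal sketch of the one thing the standard does promise about NULL: a failed malloc returns it, and you can test for it before dereferencing.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *p = malloc(sizeof *p);   /* malloc returns NULL on failure, never a valid object address */
    if (p != NULL) {              /* the one use the standard guarantees: testing "points to nothing" */
        *p = 123;
        printf("%d\n", *p);
        free(p);
    }
    return 0;
}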

how to validate a pointer using built in functions other than a NULL check?

During a discussion today I came across the fact that there are checks in VxWorks and in LynxOS which tell you whether the address you assign to a pointer is from a valid range. This is the first time I am hearing about this. Say, in code, I assign int *i=&variable;.
I should get a warning or error which says that in my application I cannot assign that address value to the pointer.
While doing a NULL check I am only checking the address 0x00000000, but the address might be 0x00000001, which is also invalid if it is an unmapped area and might not be accessible. Is anyone aware of something similar for Linux, or can anyone explain how it's done in VxWorks or LynxOS?
Any ideas??
The function you seek in VxWorks is called vxMemProbe.
Basically the vxMemProbe libraries insert special exception handling code to catch a page fault or bus error. The vxMemProbe function is used to check if the address is valid for read or write. It also allows you to test if the particular address is accessible with a given data width (8,16,32,64 bits) and alignment.
The underlying mechanism of vxMemProbe is tied to the specific architecture's exception handling. The vxMemProbe library inserts code into the exception handlers. When you probe an address that triggers an exception, the handler checks whether vxMemProbe triggered the exception. If so, the handler restores the processor state from before the exception and returns execution to where vxMemProbe was called, returning a result via the architecture's calling conventions.
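A hedged usage sketch, assuming the classic vxLib.h prototype STATUS vxMemProbe(char *adrs, int mode, int length, char *pVal) and the VX_READ/VX_WRITE mode constants:

#include <vxWorks.h>
#include <vxLib.h>      /* vxMemProbe(); prototype assumed as in the classic vxLib API */

int addr_is_readable(void *addr)
{
    char value;
    /* VX_READ probes a 1-byte read at addr; returns OK if the access
       did not fault, ERROR otherwise. */
    return vxMemProbe((char *)addr, VX_READ, 1, &value) == OK;
}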
In general you can't do what you want, as explained in Felix Palmen's answer.
I should get a warning or error which says that in my application I cannot assign that address value to the pointer.
Statically and reliably detecting all pointer faults is impossible (because it could be proven equivalent to solving the halting problem). BTW you might consider using static program analysis tools like Frama-C.
On Linux, in principle, you might test at runtime whether a given address is valid in your virtual address space by using /proc/, e.g. by parsing the /proc/self/maps pseudo textual file (to understand what I mean, try cat /proc/$$/maps in a terminal, then cat /proc/self/maps). See proc(5). In practice I don't recommend doing that often (it would probably be too slow), and of course it is not a built-in function of the compiler (you would have to code it yourself). BTW, be aware of ASLR.
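If you really wanted to, a rough sketch (my own, Linux-specific, function name hypothetical) of parsing /proc/self/maps might look like this:

#include <stdio.h>

/* Returns 1 if addr falls inside a mapping listed in /proc/self/maps,
   0 otherwise. Linux-specific, slow and racy; for illustration only. */
int addr_is_mapped(const void *addr)
{
    FILE *f = fopen("/proc/self/maps", "r");
    char line[256];
    unsigned long a = (unsigned long)addr, lo, hi;
    int mapped = 0;

    if (!f)
        return 0;
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "%lx-%lx", &lo, &hi) == 2 && a >= lo && a < hi) {
            mapped = 1;
            break;
        }
    }
    fclose(f);
    return mapped;
}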
However, there are tools to help detect (some of) the faulty address uses, in particular valgrind and the address sanitizer facility, read about instrumentation options of GCC and try to compile with -fsanitize=address ...
Don't forget to compile your code with all warnings and debug info, so use gcc -Wall -Wextra -g to compile it.
BTW, if you store the address of some local variable in a global pointer and dereference that pointer after the local variable has gone out of scope, you still have undefined behaviour (even if your code doesn't crash, because you usually end up dereferencing some stale address on your call stack), and you should be very scared. UB should always be avoided.
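For reference, a minimal sketch of that dangling-pointer pattern (the names are hypothetical):

#include <stdio.h>

int *global_p;                 /* hypothetical global pointer */

static void set_it(void)
{
    int local = 42;
    global_p = &local;         /* address of a local: only valid until set_it returns */
}

int main(void)
{
    set_it();
    printf("%d\n", *global_p); /* undefined behaviour: the local no longer exists */
    return 0;
}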
There are several misconceptions here:
From the perspective of the language C, there's only one pointer value that's guaranteed to be invalid, and this is NULL. For other values, it depends on the context. A pointer is valid when it points to an object that is currently alive. (Note that this is trivially true in your int *i = &variable example, as this is only valid syntax when there is a variable accessible from your current scope)
NULL does not necessarily mean a value with all bits zero. This is the most common case, but there can be platforms that use a different bit pattern for the NULL pointer. It's even allowed by the C standard that pointers of different types have different representations for NULL. Still, converting 0 to a pointer type is guaranteed to result in the NULL pointer for this type.
I don't know what exactly you're referring to in VxWorks, but of course Linux checks memory accesses. If a process tries to access an address that's not mapped in the virtual address space, this process is sent a SIGSEGV signal, which causes immediate abnormal program termination (Segmentation fault).

what is the meaning of (char*)1?

What is the meaning of this?
ptr=(char *)1;
in C? I searched many times but I couldn't find the meaning of this.
I know the meaning of pointer, you don't have to explain.
It converts the integer 1 to a char * pointer, which means "at address 1, I expect to have a character". Please note that this is not guaranteed to work; it depends on the system, and if the address is misaligned or otherwise invalid, accessing it causes undefined behavior.
This particular code is most likely not meaningful on any system. On systems with virtual addresses (such as a PC), you probably can't access address 1 directly. On systems where it is possible, you would normally want uint8_t* rather than char*. For example, many small microcontroller systems have various byte-sized hardware registers at address 1.
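On such a microcontroller the idiom usually looks like the following sketch; the register name and the address 0x01 are hypothetical, chosen only to mirror the cast above:

#include <stdint.h>

/* Hypothetical memory-mapped, byte-wide hardware register at address 0x01;
   the address and name are made up for illustration. volatile prevents the
   compiler from optimizing the accesses away. */
#define STATUS_REG (*(volatile uint8_t *)0x01u)

void spin_until_ready(void)
{
    while ((STATUS_REG & 0x01u) == 0) {
        /* busy-wait for the (hypothetical) ready bit */
    }
}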

What's behind NULL?

In many C implementations (but not all), NULL corresponds to a process's logical address 0. When we try to dereference NULL, we get a segmentation fault because NULL "maps" to a memory segment that is neither readable nor writable.
Are there systems with real, physical addresses that correspond to the virtual NULL address? If so, what might be stored there? Or is NULL purely a virtual address with no physical counterpart?
NULL is implementation defined.
It is a macro which expands to an implementation defined null pointer constant.
"it depends" #I3x
Is there a real, physical address that corresponds to the virtual NULL address?
On some platforms: yes, on others: no. It is implementation defined.
If so, what might be stored there?
On some platforms: a value, maybe 0. This value may or may not be accessible. On others: there is simply no access. It is implementation defined.
Or is NULL purely a virtual address with no physical counterpart?
On some platforms: it is a virtual address with no counterpart. On others it may be a real address. It is implementation defined.
From C-standard paper ISO/IEC 9899:201x (C11) §6.3.2.3 Pointers:
An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant. If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.
This means that a NULL pointer must compare equal to a constant 0 (the comparison gives TRUE); on most implementations it will also yield the value 0 if converted to an arithmetic type, although the standard does not guarantee that.
But it also means that the compiler must produce a value for the null pointer that is guaranteed never to compare equal to the address of anything that exists in the running environment.
The aim is to make code portable by requiring that every operation on a null pointer gives the same result on any CPU or architecture.
Note: on the practical side, there are two ways to guarantee that the null pointer doesn't equal any address in the running environment. The first applies to platforms where invalid addresses already exist, and takes advantage of such an invalid address. The second is to reserve an address by convention, typically 0. The latter means that the platform, mainly the OS, has to take care to make that address invalid, in the sense that it is never usable and thus can never compare equal to any other pointer; otherwise some object could legally occupy that address, and its address would then compare equal to the null pointer. This point is important because, lacking any infrastructure (such as an OS) that keeps that value invalid, we can end up with a conflict. An example is bare-metal programming (an OS or kernel itself): if the processor uses the memory starting at address 0 to hold interrupt service addresses, a pointer to that table has the value 0. In that case the compiler should use a different value for the null pointer, but if the OS can access the whole address space, to allow full memory management, there is no value left over for the null pointer.
In user code, or high-level kernel code, this is not a problem, because there are always some addresses that are not normally allowed and can therefore be used as the null pointer value.
Yes, most (all?) systems have a real physical address zero; it is just inaccessible from a process running behind a virtual address space, which remaps virtual addresses to (different) physical addresses. In such processes, address zero is simply left unmapped, just like most of the rest of their address space.
However, as a matter of fact, all x86-based processors to this day still boot into 'real mode' first, an 8086 compatibility mode. In this mode, not only is physical address zero addressable, it is where the interrupt vector table is located, and it is what the processor reads when it handles interrupts.
Certainly on some computers, there is actual memory at address 0. On others, there may not be. There are even some hardware designs for which the answer may depend on the temporary state of a memory-mapping chip or chipset.
But in all cases, dereferencing a NULL pointer is undefined behavior.
In many C implementations (but not all), NULL corresponds to a process's logical address 0.
No, that is the wrong way to look at it. Null pointers do not correspond to anything other than themselves, by definition, and C in no way ties null pointers to any particular address, in any sense of the term.
At best, you could say that some implementations' representations of null pointers have the form that a representation of a pointer to an object residing at machine address 0 would have. Since null pointers compare unequal to every pointer to an object or function, however, such a situation requires that it be impossible for any object or function that the C program can reference to reside at that address.
When we try to dereference NULL, we get a segmentation fault
That is one of the multitude of observed behaviors that are possible when you invoke undefined behavior. Empirically, it is one of the more likely to be observed when undefined behavior is invoked in the particular way you describe. It is by no means guaranteed by C, and indeed, there are widely-used implementations which, empirically, exhibit different behavior.
because NULL "maps" to a memory segment that is neither readable nor writable.
But there you're delving deep into implementation details.
Is there a real, physical address that corresponds to the virtual NULL address?
No, because a null pointer is not an address. It is a pointer value that is explicitly invalid to dereference. Technically, although C refers to some pointer values as "addresses", it neither requires nor depends on any pointer value to be a machine address, neither a physical nor a virtual one. The value of a null pointer, however, is not an address even in C's sense of the term. The & operator never produces this value, and it is invalid to dereference -- it explicitly fails to correspond to any object or function.
If so, what might be stored there? Or is NULL purely a virtual address with no physical counterpart?
Although it is indeed common for null pointer values to be represented in the way that a pointer to an object / function residing at machine address 0 would be, this is irrelevant, because in that case a C program can never access anything residing at that address. Thus, as far as the program is concerned, nothing resides there. As far as the hardware and OS kernel are concerned, on the other hand, external programs' view of memory is largely irrelevant. And if one nevertheless insists on considering the kernel's view of what is stored in physical memory at an address derived from an external program's null pointer representation, it is highly system-dependent.
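As a tiny illustration of that guarantee (the variable names are mine): a pointer obtained with & can never compare equal to NULL, whatever bit pattern NULL happens to use.

#include <assert.h>
#include <stdio.h>

int main(void)
{
    int obj = 42;
    int *p = &obj;

    /* The standard guarantees a pointer to a real object never compares
       equal to a null pointer, regardless of how NULL is represented. */
    assert(p != NULL);
    printf("&obj = %p\n", (void *)p);
    return 0;
}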

Lowest possible memory address on modern OS

It was recently pointed out that in one of my C programs, should the start address of a memory block be low enough, one of my tests would fail as a consequence of a pointer computation wrapping around zero, resulting in a crash.
At first I thought "this is a nasty potential bug", but then I wondered: can this case actually happen? I've never seen it. To be fair, this program has already run millions of times on a myriad of systems, and it has never happened so far.
Therefore, my question is:
What is the lowest possible memory address that a call to malloc() may return? To the best of my knowledge, I've never seen addresses such as 0x00000032, for example.
I'm only interested in "modern" environments, such as Linux, BSD and Windows. This code is not meant to run on a C64 or on some hobby/research OS.
First of all, since that's what you asked for, I'm only going to consider modern systems. That means they're using paged memory and have a faulting page at 0 to handle null pointer dereferences.
Now, the smallest page size I'm aware of on any real system is 4k (4096 bytes). That means you will never have valid addresses below 0x1000; anything lower would be part of the page containing the zero address, and thus would preclude having null pointer dereferences fault.
In the real world, good systems actually keep you from going that low; modern Linux even prevents applications from intentionally mapping pages below a configurable default (64k, I believe). The idea is that you want even moderately large offsets from a null pointer (e.g. p[n] where p happens to be a null pointer) to fault (and in the case of Linux, they want code in kernelspace to fault if it tries to access such addresses to avoid kernel null-pointer-dereference bugs which can lead to privilege elevation vulns).
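If you're curious what that lower bound is on a given Linux box, it is exposed through /proc/sys/vm/mmap_min_addr; a small, Linux-specific sketch that just reads and prints it:

#include <stdio.h>

int main(void)
{
    /* Linux-specific: print the lowest address userspace is allowed to mmap. */
    FILE *f = fopen("/proc/sys/vm/mmap_min_addr", "r");
    unsigned long min_addr;

    if (f && fscanf(f, "%lu", &min_addr) == 1)
        printf("mmap_min_addr = %#lx\n", min_addr);
    if (f)
        fclose(f);
    return 0;
}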
With that said, it's undefined behavior to perform pointer arithmetic outside of the bounds of the array the pointer points into. Even if the address doesn't wrap, there are all sorts of things a compiler might do (either for hardening your code, or just for optimization) where the undefined behavior could cause your program to break. Good code should follow the rules of the language it's written in, i.e. not invoke undefined behavior, even if you expect the UB to be harmless.
You probably mean that you are computing &a - 1 or something similar.
Please, do not do this, even if pointer comparison is currently implemented as unsigned comparison on most architectures, and you know that (uintptr_t)&a is larger than some arbitrary bound on current systems. Compilers will take advantage of undefined behavior for optimization. They do it now, and if they do not take advantage of it now, they will in the future, regardless of “guarantees” you might expect from the instruction set or platform.
See this well-told anecdote for more.
In a completely different register, you might think that signed overflow is undefined in C because it used to be that there were different hardware choices such as 1's complement and sign-magnitude. Therefore, if you knew that the platform was 2's complement, you might expect an expression such as (x+1) > x to detect whether x is INT_MAX.
This may be the historical reason, but the reasoning no longer holds. The expression (x+1) > x (with x of type int) is optimized to 1 by modern compilers, because signed overflow is undefined. Compiler authors do not care that the original reason for undefinedness used to be the variety of available architectures. And whatever undefined thing you are doing with pointers is next on their list. Your program will break tomorrow if you invoke undefined behavior, not because the architecture changed, but because compilers are more and more aggressive in their optimizations.
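A minimal sketch of both points: the overflowing comparison is undefined (so compilers may fold it to 1), while comparing against INT_MAX is the well-defined way to ask the same question.

#include <limits.h>
#include <stdio.h>

int main(void)
{
    int x = INT_MAX;

    /* Undefined: x + 1 overflows, so compilers may fold this test to 1. */
    /* printf("%d\n", (x + 1) > x); */

    /* Well-defined way to ask "would x + 1 overflow?" */
    printf("%d\n", x < INT_MAX);   /* prints 0 here */
    return 0;
}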
Dynamic allocations are performed on the heap. The heap resides in a process's address space just after the text (the program code), initialized data and uninitialized data sections; see here: http://www.cprogramming.com/tutorial/virtual_memory_and_heaps.html . So the minimal possible address in the heap depends on the size of these three segments; there is no absolute answer, since it depends on the particular program.

Resources