Declare a pointer to an integer at address 0x200 in memory - c

I have a couple of doubts. I remember reading somewhere that it is not possible to manually place a variable at a particular location in memory, but then I came across this code:
#include<stdio.h>
void main()
{
int *x;
x=0x200;
printf("Number is %lu",x); // Checkpoint1
scanf("%d",x);
printf("%d",*x);
}
Is it that we can not put it in a particular location, or we should not put it in a particular location since we will not know if it's a valid location or not?
Also, in this code, up to the first checkpoint, I get the output 512.
And then after that Seg Fault.
Can someone explain why? Is 0x200 not a valid memory location?

In the general case - the behavior you will get is undefined - everything can happen.
On 32-bit x86 Linux, for example, the top 1 GB of the virtual address space is reserved for the kernel, and very low addresses such as 0x200 are normally left unmapped. If you try to access either region from user mode, you will get a seg fault.
No idea how it works in windows.
Reference for linux claim:
Currently the 32 bit x86 architecture is the most popular type of
computer. In this architecture, traditionally the Linux kernel has
split the 4GB of virtual memory address space into 3GB for user
programs and 1GB for the kernel.

Adding to what @amit wrote:
In windows it is the same. In general it is the same for all protected-mode operating systems. Since DOS etc. are no longer around it is the same with all systems except kernel-mode (km-drivers) and embedded systems.
The operating system manages which memory-pages you are allowed to write to and places markers that will make the cpu automatically raise access-violations if some other page is written to.

Up until the "checkpoint", you haven't accessed memory location 0x200, so everything works fine.
There is a local variable x in the function main. It is of type "pointer to int". x is assigned the value 0x200, and then that value is printed. But the target of x hasn't been accessed, so up to this point it doesn't matter whether x holds a valid memory address or not.
Then scanf tries to write to the memory address you passed in, which is the 0x200 stored in x. Then you get a seg fault, which is certainly a possible result of trying to write to an arbitrary memory address.
So what are your doubts? What makes you think that this might work, when you come across this code that clearly doesn't?
Writing to a particular memory address might work under certain conditions, but is extremely unlikely to in general. Under all modern OSes, normal programs do not have control over their memory layout. The OS decides where initial things like the program's code, stack, and globals go. The OS will probably also be using some memory space, and it is not required to tell you what it's using. Instead you ask for memory (either by making variables or by calling memory allocation routines), and you use that.
So writing to particular addresses is very likely to hit either memory that hasn't been allocated, or memory that is being used for some other purpose. Neither of those is good, even if you do manage to hit an address that is actually writable. What if you clobber some piece of data used by one of your program's other variables? Or some other part of your program clobbers the value you just wrote?
You should never be choosing a particular hard-coded memory address, you should be using an address of something you know is a variable, or an address you got from something like malloc.

Related

Is reading from unallocated memory safe?

Is reading from a random address safe? I know writing is undefined behaviour but how about reading only?
Well, in many visual debuggers, I can see the contents of the memory in an arbitrary address. How is this done?
Since the behavior is undefined, the answer is undefined - or at the very least, erratic.
If you get lucky and the random address is within the memory bounds of your program, it would be fine to read most likely and you'd just get random junk.
If it's outside your program's mapped memory (e.g. 0x0/NULL), you'd most likely get a segmentation fault (although again, this isn't guaranteed), which would terminate your program. If you'd consider that "safe" then yes, otherwise no.
No, it is not safe. Even if you don't care about the value being defined or accurate, there is such a thing as memory mapped IO, so a random address could interact with peripheral hardware. I did that in the days before protected memory, and yes, it can bring down the system.
Nowadays, depending on your system, I'd expect to see a segfault for addresses outside your process space. Without that protection, a bad app could access valuable data, like passwords, credit card info, etc. when used in a good app.
Also, addresses you see in the debugger are likely not real, physical addresses. Instead, you probably only see virtual memory addresses.

Access process memory directly

simple question:
Is it possible, and if so how, to access the virtual memory of my program directly?
To be specific,
instead of typing
int someValue = 5;
can I do something like this:
VirtualMemory[0x0] = (int)5;
I'm just asking because I want the values to be stored next to each other to get a nice and small memory map.
When I look into assembler basics, the processor stores values directly after each other and I was wondering how to do so in c.
Thanks for all of your replies.
Cheers,
Lucky
Not exactly, because in the source code you don't know which memory address your program is going to be "loaded into". So all memory addresses in the program are encoded in an "offset from the start of program" type manner.
Part of the "process loader"'s responsibility in copying the program into memory is to add the "base offset pointer" to all the other offsets, so all the "names" describing memory addresses refer to actual memory addresses instead of "offsets from the beginning of the program".
That's generally a good thing, as if they were encoded directly, two programs that needed the same set of addresses couldn't be run at the same time without corrupting each other's shared memory. In addition, loading a program into a different starting address would not be possible, as walking outside of the memory of your program (nearly guaranteed if you relocate the program without rewriting the memory address references) is going to raise a segfault in the operating system's memory management monitors.
Also you need a name to start at, and this means that the offsets are bound to the variable names. Generally it is much easier to fish around in the heap starting from an alloc'd item than it is to truly find the start of the program loaded in memory (because the C programming language doesn't really capture that address in an in-language variable name, and the layout is somewhat system dependent).

Accessing memory below the stack on linux

This program accesses memory below the stack.
I would expect to get a segfault or just nuls when going out of stack bounds, but I see actual data. (This is assuming 100kb below the stack pointer is beyond the stack bounds.)
Or is the system actually letting me see memory below the stack? Weren't there supposed to be kernel level protections against this, or does that only apply to allocated memory?
Edit: With 1024*127 below char pointer it randomly segfaults or runs, so the stack doesn't seem to be a fixed 8MB, and there seems to be a bit of random to it too.
#include <stdio.h>
int main(){
char * x;
int a;
for( x = (char *)&x-1024*127; x<(char *)(&x+1); x++){
a = *x & 0xFF;
printf("%p = 0x%02x\n",x,a);
}
}
Edit: Another weird thing. The first program segfaults at only 1024*127, but if I printf downwards away from the stack I don't get a segfault and all the memory seems to be empty (all 0x00):
#include <stdio.h>
int main(){
char * x;
int a;
for( x = (char *)(&x); x>(char *)&x-1024*1024; x--){
a = *x & 0xFF;
printf("%p = 0x%02x\n",x,a);
}
}
When you access memory, you're accessing the process address space.
The process address space is divided into pages (typically 4 KB on x86). These are virtual pages: their contents are held elsewhere. The kernel manages a mapping from virtual pages to their contents. Contents can be provided by:
A physical page, for pages that are currently backed by physical RAM. Accesses to these happen directly (via the memory management hardware).
A page that's been swapped out to disk. Accessing this will cause a page fault, which the kernel handles. It needs to fill a physical page with the on-disk contents, so it finds a free physical page (perhaps swapping that page's contents out to disk), reads in the contents from disk, and updates the mapping to state that "virtual page X is in physical page Y".
A file (i.e. a memory mapped file).
Hardware devices (i.e. hardware device registers). These don't usually concern us in user space.
Suppose that we have a 4 GB virtual address space, split into 4 KB pages, giving us 1048576 virtual pages. Some of these will be mapped by the kernel; others will not. When the process starts (i.e. when main() is invoked), the virtual address space will contain, amongst other things:
Program code. These pages are usually readable and executable.
Program data (i.e. for initialised variables). This usually has some read-only pages and some read-write pages.
Code and data from libraries that the program depends on.
Some pages for the stack.
These things are all mapped as pages in the 4 GB address space. You can see what's mapped by looking at /proc/(pid)/maps, as one of the comments has pointed out. The precise contents and location of these pages depend on (a) the program in question, and (b) address space layout randomisation (ASLR), which makes locations of things harder to guess, thereby making certain security exploitation techniques more difficult.
You can access any particular location in memory by defining a pointer and dereferencing it:
*(unsigned char *)0x12345678
If this happens to point to a mapped page, and that page is readable, then the access will succeed and yield whatever's mapped at that address. If not, then you'll receive a SIGSEGV from the kernel. You could handle that (which is useful in some cases, such as JIT compilers), but normally you don't, and the process will be terminated. As noted above, due to ASLR, if you do this in a program and run the program several times then you'll get non-deterministic results for some addresses.
There is usually quite a bit of accessible memory below the stack pointer, because that memory is used when you grow the stack normally. The stack itself is only controlled by the value of the stack pointer - it is a software entity, not a hardware entity.
However, system code may assume typical stack usage. I. e., on some systems, the stack is used to store state for a context switch, while a signal handler runs, etc. This also depends on whether the hardware automatically switches stack pointers when leaving user mode. If the system does use your stack for this, it will clobber the data you stored there, and that can really happen at every point in your program.
So it is not safe to manipulate stack memory below the stack pointer. It's not even safe to assume that a value that has successfully been written will still be the same in the next line of code. Only the portion above the stack pointer is guaranteed not to be touched by the runtime/kernel.
It goes without saying, that this code invokes undefined behavior. The pointer arithmetic is invalid, because the address &x-1024*127 is not allocated to the variable x, so that dereferencing this pointer invokes undefined behavior.
This is undefined behavior in C. You're accessing a random memory address which, depending on the platform, may or may not be on the stack. It may or may not be in memory this user can access; if not you will get a segfault or similar. There are absolutely no promises either way.
Actually, it's not undefined behaviour, it's pretty well defined. Accessing memory locations through pointers is and was always defined since C is as close to the hardware as it can be.
I however agree that accessing hardware through pointers when you don't know exactly what you're doing is a dangerous thing to do.
Don't Do That. (If you're one of the five or six people who has a legitimate reason to do this, you already know it and don't need our advice.)
It would be a poor world with only five or six people legitimately programming operating systems, embedded devices and drivers (although it sometimes appears as if the latter is the case...).

What happens in OS when we dereference a NULL pointer in C?

Let's say there is a pointer and we initialize it with NULL.
int* ptr = NULL;
*ptr = 10;
Now, the program will crash since ptr isn't pointing to any valid address and we're assigning a value through it, which is an invalid access. So the question is: what happens internally in the OS? Does a page fault or segmentation fault occur? Will the kernel even search the page table, or does the crash occur before that?
I know I wouldn't do such a thing in any program but this is just to know what happens internally in the OS or Compiler in such a case. And it is NOT a duplicate question.
Short answer: it depends on a lot of factors, including the compiler, processor architecture, specific processor model, and the OS, among others.
Long answer (x86 and x86-64): Let's go down to the lowest level: the CPU. On x86 and x86-64, that code will typically compile into an instruction or instruction sequence like this:
movl $10, 0x00000000
Which says to "store the constant integer 10 at virtual memory address 0". The Intel® 64 and IA-32 Architectures Software Developer Manuals describe in detail what happens when this instruction gets executed, so I'm going to summarize it for you.
The CPU can operate in several different modes, several of which are for backwards compatibility with much older CPUs. Modern operating systems run user-level code in a mode called protected mode, which uses paging to convert virtual addresses into physical addresses.
For each process, the OS keeps a page table which dictates how the addresses are mapped. The page table is stored in memory in a specific format (and protected so that they can not be modified by the user code) that the CPU understands. For every memory access that happens, the CPU translates it according to the page table. If the translation succeeds, it performs the corresponding read/write to the physical memory location.
The interesting things happen when the address translation fails. Not all addresses are valid, and if any memory access generates an invalid address, the processor raises a page fault exception. This triggers a transition from user mode (aka current privilege level (CPL) 3 on x86/x86-64) into kernel mode (aka CPL 0) to a specific location in the kernel's code, as defined by the interrupt descriptor table (IDT).
The kernel regains control and, based on the information from the exception and the process's page table, figures out what happened. In this case, it realizes that the user-level process accessed an invalid memory location, and then it reacts accordingly. On Windows, it will invoke structured exception handling to allow the user code to handle the exception. On POSIX systems, the OS will deliver a SIGSEGV signal to the process.
In other cases, the OS will handle the page fault internally and restart the process from its current location as if nothing happened. For example, guard pages are placed at the bottom of the stack to allow the stack to grow on demand up to a limit, instead of preallocating a large amount of memory for the stack. Similar mechanisms are used for achieving copy-on-write memory.
In modern OSes, the page tables are usually set up to make the address 0 an invalid virtual address. But sometimes it's possible to change that, e.g. on Linux by writing 0 to the pseudofile /proc/sys/vm/mmap_min_addr, after which it's possible to use mmap(2) to map the virtual address 0. In that case, dereferencing a null pointer would not cause a page fault.
The above discussion is all about what happens when the original code is running in user space. But this could also happen inside the kernel. The kernel can (and is certainly much more likely than user code to) map the virtual address 0, so such a memory access would be normal. But if it's not mapped, then what happens then is largely similar: the CPU raises a page fault error which traps into a predefined point at the kernel, the kernel examines what happened, and reacts accordingly. If the kernel can't recover from the exception, it will typically panic in some fashion (kernel panic, kernel oops, or a BSOD on Windows, e.g.) by printing out some debug information to the console or serial port and then halting.
See also Much ado about NULL: Exploiting a kernel NULL dereference for an example of how an attacker could exploit a null pointer dereference bug from inside the kernel in order to gain root privileges on a Linux machine.
As a side note, just to highlight the differences between architectures: a certain OS developed and maintained by a company known for its three-letter acronym name, and often referred to as a large primary color, has a most fascinating way of determining NULL.
They utilize a 128-bit linear address space for ALL data (memory AND disk) in one giant "thing". In accordance with their OS, a "valid" pointer must be placed on a 128-bit boundary within that address space. This, btw, causes fascinating side effects for structs, packed or not, that house pointers. Anyway, tucked away in a per-process dedicated page is a bitmap that assigns one bit for every valid location in a process address space where a valid pointer can lay. ALL opcodes on their hardware and OS that can generate and return a valid memory address and assign it to a pointer will set the bit that represents the memory address where that pointer (the target pointer) is located.
So why should anyone care? For this simple reason:
int a = 0;
int *p = &a;
int *q = p-1;
if (p)
{
// p is valid, p's bit is lit, this code will run.
}
if (q)
{
// the address stored in q is not valid. q's bit is not lit. this will NOT run.
}
What is truly interesting is this.
if (p == NULL)
{
// p is valid. this will NOT run.
}
if (q == NULL)
{
// q is not valid, and therefore treated as NULL, this WILL run.
}
if (!p)
{
// same as before. p is valid, therefore this won't run
}
if (!q)
{
// same as before, q is NOT valid, therefore this WILL run.
}
It's something you have to see to believe. I can't even imagine the housekeeping done to maintain that bit map, especially when copying pointer values or freeing dynamic memory.
On CPUs which support virtual memory, a page fault exception will usually be raised if you try to read memory address 0x0. The OS page fault handler will be invoked, and the OS will then decide that the page is invalid and abort your program.
Note that on some CPUs you can also safely access memory address 0x0.
As the C Standard says dereferencing a null pointer is undefined, if the compiler is able to detect at compile time (or even at runtime) that you are dereferencing a null pointer, it can do whatever it wants, like aborting the program with a verbose error message.
(C99, 6.5.3.2.p4) "If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.87)"
87): "Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime."
In a typical case, int *ptr = NULL; will set ptr to point to address 0. The C standard (and the C++ standard) is very careful to not require that, but it's extremely common nonetheless.
When you do *ptr = 10;, the CPU would normally generate 0 on the address lines, and 10 on the data lines, while setting a R/W line to indicate a write (and, if the bus has such a thing, assert the memory vs. I/O line to indicate a write to memory, not I/O).
Assuming the CPU supports memory protection (and you're using an OS that enables it), the CPU will check that (attempted) access before it happens though. For example, a modern Intel/AMD CPU will use paging tables that map virtual addresses to physical addresses. In a typical case, address 0 won't be mapped to any physical address. In this case, the CPU will generate an access violation exception. For one fairly typical example, Microsoft Windows leaves the first 4 megabytes un-mapped, so any address in that range will normally result in an access violation.
On an older CPU (or an older operating system that doesn't enable the CPU's protection features) the attempted write will often succeed. For example, under MS-DOS, writing through a NULL pointer would simply write to address zero. In small or medium model (with 16-bit addresses for data) most compilers would write some known pattern to the first few bytes of the data segment, and when the program ended, they'd check to see if that pattern remained intact (and do something to indicate that you'd written via a NULL pointer if it failed). In compact or large model (20-bit data addresses) they'd generally just write to address zero without warning.
I imagine that this is platform and compiler dependent. The NULL pointer could be implemented by using a NULL page, in which case you'd have a page fault, or it could be below the segment limit for an expand-down segment, in which case you'd have a segmentation fault.
This is not a definitive answer, just my conjecture.

assigning a value to the address

I tried the program below to make a pointer point to a particular address and to store a value at that address. When I try to store a value through the pointer, I get a run-time error asking me to close the program.
Is it not possible to assign a value to the address 0x6778? Why is that? In what situations is this needed? Please help me understand.
int *p=(int*)0x6778;
printf("The address is:%x",p);
When I try to do *p=1000, I get the error.
There are many reasons why this could give you an error:
The address 0x6778 might not be part of this process's virtual memory -- it might not really "exist". You could read more about virtual memory, but basically addresses don't refer directly to physical bytes -- they have to be translated in a table, and that table might not have an entry for your address.
If it is mapped, it might be on a read-only page
If it's mapped and writable, it might corrupt some other part of your program, causing a segfault soon after.
In general, you probably can't write to an arbitrary address in a user-level application. Of course, if you're running a kernel or embedded system, ignore this answer, as it totally does not apply ;-)
That address is likely not in your process's address space, so your program receives an exception from the operating system when you try to access it. You shouldn't be trying to use specific memory locations to store things... rather, use malloc for dynamic allocation, or put things on the stack.
int *p=(int*)0x6778;
To do this, the address location 0x6778 should be a valid address location in first place.
An address space is allocated to every process, and your program runs in a particular process. If a program tries to access an address outside its address space, it will crash. That seems to be what is happening in your case.
Unless you are sure that a virtual address is valid for use by your program, DO NOT access addresses explicitly. Let the compiler place your objects in the address space allocated to your process and hand their addresses back to you. The simplest way to do that is to use local variables with automatic storage, or to use malloc for dynamic allocations.

Resources