Strange memory allocation code in C: how does it work?

How does this code work???
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int *addr = (int*) 0x4888d0;
*addr = 30;
printf("%i %p\n", *addr, addr);
return 0;
}

It works by assuming 0x4888d0 is the address of a writable block of memory of at least sizeof(int) bytes that does not interfere with the functionality of printf or the C runtime system.
Or rather, it doesn't work, at least not on my system (Segmentation fault).

There is nothing strange about it, but it is quite dangerous. The program tries to write 30 to a specific location, namely the int stored at address 0x4888d0.
Why the code was written like this, and why this particular address was chosen, is anybody's guess.

int *addr = (int*) 0x4888d0; makes addr hold the address 0x4888d0. That address might be valid on some system, but there is no guarantee that it is, so the code cannot be relied on to work.

As stated above, in a normal application this tends to cause a segmentation fault or to corrupt memory. However,
Many computers have reserved memory addresses that do magic things, such as controlling I/O, setting CPU modes, or updating memory maps. Such addresses are "real" (physical) addresses that are not mapped into the virtual memory regular applications get; they are where the kernel communicates with hardware controllers. A fragment that pokes a memory location and then promptly reads it back is typical of asking a controller for some type of status and then getting the status back (any write to a magic word can update the status the controller makes available to the software, so the value written may not matter).
So, if this code runs in kernel space, on a microcontroller, or on some other unusual system, the magic memory addresses could be available. Another possibility is that a privileged application has requested a special virtual memory mapping from the kernel that exposes magic pages to it. This can get weird: although the application asked for something to be mapped at a virtual address range that includes 0x4888d0, the real memory page behind it could sit at a quite different physical address (one the application cannot reach in any other way).

Related

Why are virtual addresses so big in my C program?

I recently learned about virtual memory and paging, and that compilers only generate virtual addresses, starting at 1 and simply counting upwards. I thought I'd test this and wrote the short C program below, which defines a global variable and prints its address. I expected a very small value, since the CPU only sees the virtual addresses, but instead I get 4247584. What is going on here? Are my assumptions wrong? And if possible, what would be a program that shows virtual addresses being generated from 1 up?
My program:
#include <stdio.h>
int x = 0;
int main(){
printf("%d\n", &x);
return 0;
}
(I'm using gcc 4.8.1 on Windows 10)
The actual value of a virtual address is relatively non-essential (well, because it's virtual). There's nothing "wasted" when it doesn't start at 0. The only precondition for address values is that the program, data and all its associated shared libraries actually fit into the value-space.
For security reasons, however, it makes sense to allocate the various code and data areas of a process in virtual address space in a way non-reproducible by a potential attacker (makes code injection attacks at fixed addresses virtually[sic] impossible), that is why modern operating systems allocate virtual address space values for a program randomly.
On some operating systems like Linux you may be able to switch off virtual address space layout randomization and thus make it reproducible. Addresses will most probably still not start at zero, because libraries and startup code will most likely occupy addresses lower than your own program.

How to print the value in a given memory address without segment fault?

I found an interesting phenomenon when I execute a simple test code:
int main(){
int *p=(int *)0x12f930;
printf("%d",*p);
return 0;
}
Of course it crashed with a segmentation fault. But even when I change 0x12f930 to 0x08048001 (0x08048000+1, which should be in the text segment when executing the ELF binary), it still crashes with a segfault.
Then I changed my code as below:
int main()
{
int i=1;
printf("%x",&i);
return 0;
}
The output is 0xf3ee8f0c, but as far as I know, user-space addresses should be <= 0xc0000000, so I am quite confused.
Can anyone help?
First, don't ever do this unless there's a specific need to.
That said, certain embedded applications and legacy systems might need explicit memory access. So here's some example code:
const unsigned addr = 0xdeadbeee;//This address is an example, which should always be >0xc000000 and const
const unsigned *ptr=(const unsigned*)addr;//Then you can assign it to a pointer after proper casting and keeping it const, unless there's a need to keep it not-const
Be careful, as you may hit unallocated memory or, worse, trash memory and even cause system instability. Also, the result of the cast above is implementation-defined and as such not portable across systems.
If you are running your program under an OS, you need to understand the memory addressing scheme that OS follows. In particular, some OSes assign a random starting address to the stack and/or heap in order to make memory and processes harder to attack. So every time you execute the program, the process's addresses will be different.
If you wish to examine a process's memory, you could refer to source of GDB and how they do it.

Accessing memory below the stack on linux

This program accesses memory below the stack.
I would assume to get a segfault or just nuls when going out of stack bounds but I see actual data. (This is assuming 100kb below stack pointer is beyond the stack bounds)
Or is the system actually letting me see memory below the stack? Weren't there supposed to be kernel level protections against this, or does that only apply to allocated memory?
Edit: With 1024*127 below char pointer it randomly segfaults or runs, so the stack doesn't seem to be a fixed 8MB, and there seems to be a bit of random to it too.
#include <stdio.h>
int main(){
char * x;
int a;
for( x = (char *)&x-1024*127; x<(char *)(&x+1); x++){
a = *x & 0xFF;
printf("%p = 0x%02x\n",x,a);
}
}
Edit: Another weird thing. The first program segfaults at only 1024*127, but if I printf downwards away from the stack I don't get a segfault and all the memory seems to be empty (all 0x00):
#include <stdio.h>
int main(){
char * x;
int a;
for( x = (char *)(&x); x>(char *)&x-1024*1024; x--){
a = *x & 0xFF;
printf("%p = 0x%02x\n",x,a);
}
}
When you access memory, you're accessing the process address space.
The process address space is divided into pages (typically 4 KB on x86). These are virtual pages: their contents are held elsewhere. The kernel manages a mapping from virtual pages to their contents. Contents can be provided by:
A physical page, for pages that are currently backed by physical RAM. Accesses to these happen directly (via the memory management hardware).
A page that's been swapped out to disk. Accessing this will cause a page fault, which the kernel handles. It needs to fill a physical page with the on-disk contents, so it finds a free physical page (perhaps swapping that page's contents out to disk), reads in the contents from disk, and updates the mapping to state that "virtual page X is in physical page Y".
A file (i.e. a memory mapped file).
Hardware devices (i.e. hardware device registers). These don't usually concern us in user space.
Suppose that we have a 4 GB virtual address space, split into 4 KB pages, giving us 1048576 virtual pages. Some of these will be mapped by the kernel; others will not. When the process starts (i.e. when main() is invoked), the virtual address space will contain, amongst other things:
Program code. These pages are usually readable and executable.
Program data (i.e. for initialised variables). This usually has some read-only pages and some read-write pages.
Code and data from libraries that the program depends on.
Some pages for the stack.
These things are all mapped as pages in the 4 GB address space. You can see what's mapped by looking at /proc/(pid)/maps, as one of the comments has pointed out. The precise contents and location of these pages depend on (a) the program in question, and (b) address space layout randomisation (ASLR), which makes locations of things harder to guess, thereby making certain security exploitation techniques more difficult.
You can access any particular location in memory by defining a pointer and dereferencing it:
*(unsigned char *)0x12345678
If this happens to point to a mapped page, and that page is readable, then the access will succeed and yield whatever's mapped at that address. If not, then you'll receive a SIGSEGV from the kernel. You could handle that (which is useful in some cases, such as JIT compilers), but normally you don't, and the process will be terminated. As noted above, due to ASLR, if you do this in a program and run the program several times then you'll get non-deterministic results for some addresses.
There is usually quite a bit of accessible memory below the stack pointer, because that memory is used when you grow the stack normally. The stack itself is only controlled by the value of the stack pointer - it is a software entity, not a hardware entity.
However, system code may assume typical stack usage. For example, on some systems the stack is used to store state for a context switch, while a signal handler runs, and so on. This also depends on whether the hardware automatically switches stack pointers when leaving user mode. If the system does use your stack for this, it will clobber the data you stored there, and that can happen at any point in your program.
So it is not safe to manipulate stack memory below the stack pointer. It's not even safe to assume that a value that has successfully been written will still be the same in the next line of code. Only the portion above the stack pointer is guaranteed not to be touched by the runtime/kernel.
It goes without saying that this code invokes undefined behavior. The pointer arithmetic itself is invalid, because the address &x-1024*127 does not lie within the object x, and dereferencing the resulting pointer is undefined as well.
This is undefined behavior in C. You're accessing a random memory address which, depending on the platform, may or may not be on the stack. It may or may not be in memory this user can access; if not you will get a segfault or similar. There are absolutely no promises either way.
Actually, it's not undefined behaviour, it's pretty well defined. Accessing memory locations through pointers is and was always defined since C is as close to the hardware as it can be.
I however agree that accessing hardware through pointers when you don't know exactly what you're doing is a dangerous thing to do.
Don't Do That. (If you're one of the five or six people who has a legitimate reason to do this, you already know it and don't need our advice.)
It would be a poor world with only five or six people legitimately programming operating systems, embedded devices and drivers (although it sometimes appears as if the latter is the case...).

Access to a specific memory address

I'm a new C programmer, still learning the language itself.
Anyway -
I'm trying to access a specific memory address.
I've written this code:
#include <stdio.h>
int main()
{
int* p = (int*) 0x4e0f68;
*p = 12;
getchar();
}
When I try to access a specific memory address like that, the program crashes.
I don't know if this information is relevant, but I'm using Windows 7 and Linux Ubuntu.
(I've tried this code only on Windows 7).
Any explanations why the program crashes?
How can I access a specific memory address (an address which is known at compile-time, I don't mean to dynamic memory allocation)?
Thanks.
That's memory you don't own and accessing it is undefined behavior. Anything can happen, including crashing.
On most systems, you'd be able to inspect the memory (although technically still undefined behavior), but writing to it is a whole different story.
Strictly speaking you cannot create a valid pointer like this. Valid pointers must point to valid objects (either on your stack or obtained from malloc).
For most modern operating systems you have a virtual memory space that only your process can see. As you request more memory from the system (malloc, VirtualAlloc, mmap, etc) this virtual memory is mapped into real usable memory that you can safely read and write to. So you can't just take an arbitrary address and try to use it without OS cooperation.
An example for Windows:
#include <windows.h>
#include <stdio.h>
int main(void)
{
    SYSTEM_INFO sysinfo;
    GetSystemInfo(&sysinfo);
    unsigned pageSize = sysinfo.dwPageSize;
    printf("page size: %u\n", pageSize);
    void* target = (void*)0x4e0f68;
    printf("trying to allocate exactly one page containing 0x%p...\n", target);
    // A fresh address range must be reserved as well as committed;
    // MEM_COMMIT alone fails at a specific address that was never reserved.
    void* ptr = VirtualAlloc(target, pageSize, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    if (ptr)
        printf("got: 0x%p\n", ptr); // ptr <= target < ptr+pageSize
    else
        printf("failed! OS won't let us use that address.\n");
    return 0;
}
Note that this will give you different results on different runs. Try it more than once.
Just to clarify one phrase the OP wrote: strictly speaking, no address associated with a program (code or data) is known at compile time. Programs are usually loaded at whatever address the OS determines. The final address a program sees (for example, to read a global variable) is patched by the OS loader into the program code itself, using some sort of relocation table. DLL functions called by a program use a similar mechanism, in which the IDATA section of the executable is turned into a jump table that jumps to the actual address of each function in a DLL, taking the actual addresses from the DLL in memory.
That said, it is indeed possible to know in advance where a variable will be placed, if the program is linked with no relocation information. This is possible in Windows, where you can tell the linker to load the program at an absolute virtual address. The OS loader will try to load your program at that address, if possible.
However, this feature is not recommended because it makes security holes easier to exploit. If an attacker discovers a security hole in a program and tries to inject code into it, the job is easier when the program has all its variables and functions at fixed addresses, because the malicious code then knows where to patch to gain control of the program.
What you're getting is a segfault, which happens when you try to access memory you don't have permission to access. Pointers, at least in userspace, must point to some variable, object, function, etc. You can set a pointer to a variable with the & operator (int* somePtr = &variableToPointTo) or to another pointer (int* someNewPtr = somePtr). In kernel mode (ring 0) or for OS development you can use hard-coded addresses, BUT IT IS NOT ADVISED TO DO SO. In MS-DOS, you could destroy your machine because there was no protection against that.

Declare a pointer to an integer at address 0x200 in memory

I have a couple of doubts. I remember reading somewhere that it is not possible for me to manually put a variable at a particular location in memory, but then I came across this code:
#include<stdio.h>
void main()
{
int *x;
x=0x200;
printf("Number is %lu",x); // Checkpoint1
scanf("%d",x);
printf("%d",*x);
}
Is it that we can not put it in a particular location, or we should not put it in a particular location since we will not know if it's a valid location or not?
Also, in this code, up to the first checkpoint I get 512 as output.
After that, I get a seg fault.
Can someone explain why? Is 0x200 not a valid memory location?
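In the general case, the behavior you will get is undefined: anything can happen.
On 32-bit Linux, for example, the top 1GB of the virtual address space is reserved for the kernel, and touching it from user mode gives a seg fault. Address 0x200 is at the other end: it lies in the lowest page, which is normally left unmapped so that null-pointer dereferences are caught, so accessing it gives a seg fault as well.
No idea how it works in Windows.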
Reference for linux claim:
Currently the 32 bit x86 architecture is the most popular type of
computer. In this architecture, traditionally the Linux kernel has
split the 4GB of virtual memory address space into 3GB for user
programs and 1GB for the kernel.
Adding to what @amit wrote:
In Windows it is the same. In general it is the same for all protected-mode operating systems. Since DOS and the like are no longer around, this holds for practically all systems except kernel-mode code (kernel-mode drivers) and embedded systems.
The operating system manages which memory-pages you are allowed to write to and places markers that will make the cpu automatically raise access-violations if some other page is written to.
Up until the "checkpoint", you haven't accessed memory location 0x200, so everything works fine.
There is a local variable x in the function main. It is of type "pointer to int". x is assigned the value 0x200, and then that value is printed. But the target of x hasn't been accessed, so up to this point it doesn't matter whether x holds a valid memory address or not.
Then scanf tries to write to the memory address you passed in, which is the 0x200 stored in x. Then you get a seg fault, which is certainly a possible result of trying to write to an arbitrary memory address.
So what are your doubts? What makes you think that this might work, when you come across this code that clearly doesn't?
Writing to a particular memory address might work under certain conditions, but is extremely unlikely to in general. Under all modern OSes, normal programs do not have control over their memory layout. The OS decides where initial things like the program's code, stack, and globals go. The OS will probably also be using some memory space, and it is not required to tell you what it's using. Instead you ask for memory (either by making variables or by calling memory allocation routines), and you use that.
So writing to particular addresses is very likely to hit either memory that hasn't been allocated or memory that is being used for some other purpose. Neither of those is good, even if you do manage to hit an address that is actually writable. What if you clobber some piece of data used by one of your program's other variables? Or some other part of your program clobbers the value you just wrote?
You should never be choosing a particular hard-coded memory address, you should be using an address of something you know is a variable, or an address you got from something like malloc.

Resources