I am familiar with the BIOS INT 15h, E820 function, where you could choose a fixed physical location, put whatever you wanted there, and the OS would not overwrite it; you could then access that fixed memory address directly (perhaps mapping it to a virtual pointer first, etc.).
But in the UEFI case, as far as I am aware, there is no memory area reserved for the user, so I cannot rely on allocating a buffer at a specific physical address (if that is even possible?). Instead I have to use a UEFI memory allocation function, which returns a pointer that is not fixed.
So my questions are -
Is it possible to allocate a buffer that will not be overwritten once the OS goes up?
How can I pass the OS the pointer to the allocated buffer, so that I can access it from the OS (again, even assuming the buffer itself is not overwritten, its allocation is not at a fixed location)?
Thank you!
Yes. Allocate memory of a non-reclaimable type, such as EfiRuntimeServicesData.
The mechanism UEFI uses is called configuration tables.
Note: EfiPersistentMemory is something completely different.
Configuration tables are installed by calling InstallConfigurationTable during boot services, with the two parameters being a GUID and a pointer to the physical address of the data structure you want to pass. This pair is then linked into an array pointed to by the UEFI System Table.
How you extract that information in Windows, I do not know. In Linux, the UEFI system table is globally accessible in kernel space (efi->systab), so the pointer can be extracted from there.
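A minimal EDK II-style sketch of that flow (the GUID below is made up and must be replaced with your own; error handling is trimmed):

#include <Uefi.h>
#include <Library/UefiBootServicesTableLib.h>  /* provides gBS in EDK II */

/* Hypothetical GUID identifying our private table; generate your own. */
STATIC EFI_GUID mMyBufferGuid =
  { 0x12345678, 0x1234, 0x1234, { 0x12, 0x34, 0x12, 0x34, 0x12, 0x34, 0x12, 0x34 } };

EFI_STATUS
PublishBuffer (VOID)
{
  EFI_STATUS  Status;
  VOID        *Buffer;

  /* EfiRuntimeServicesData survives ExitBootServices(), so the OS will not reuse it. */
  Status = gBS->AllocatePool (EfiRuntimeServicesData, SIZE_4KB, &Buffer);
  if (EFI_ERROR (Status)) {
    return Status;
  }

  /* Link the buffer's address into the system table's configuration table array. */
  return gBS->InstallConfigurationTable (&mMyBufferGuid, Buffer);
}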
In Python, using ctypes if applicable, how can I read the value that a memory address points to?
For instance, when I boot up my x86 PC, let's say the address 0xfffff800 holds the value 0xffffffff.
Using Python, how do I extract the value stored at 0xfffff800 (0xffffffff) and save it into a variable? Is this even possible? I have tried using id, but I believe that only works for existing Python objects (if I created a variable, assigned it a value, and passed it to id).
Thanks
From your reference to memory contents at boot time, and the idea implicit in the question that there is only one value at any given address, I take you to be talking about physical memory. In that case, no, what you ask is not possible. Only the operating system kernel (or some other program running directly on the hardware) can access physical memory in any system that presently supports Python, and to the best of my knowledge, there is no Python implementation that runs on bare metal.
Instead, the operating system affords each running process its own virtual memory space in which to run, whose contents at any given time might reside more or less anywhere in physical memory or on swap devices (disk files and / or partitions). The system takes great care to isolate processes from each other and from the underlying physical storage, so no, Python cannot access it.
Moreover, processes running on an OS cannot generally access arbitrary virtual addresses, either, regardless of the programming language in which they are written. They can access only those portions of their address spaces that the OS has mapped for them. ctypes therefore does not help you.
simple question:
Is it possible, and how is it possible, to access the virtual memory of my program directly?
To be specific,
instead of typing
int someValue = 5;
can I do something like this:
VirtualMemory[0x0] = (int)5;
I'm just asking because I want the values to be stored next to each other to get a nice and small memory map.
When I look at assembly basics, the processor stores values directly after each other, and I was wondering how to do the same in C.
Thanks for all of your replies.
Cheers,
Lucky
Not exactly, because in the source code you don't know which memory address your program is going to be "loaded into". So all memory addresses in the program are encoded in an "offset from the start of program" type manner.
Part of the "process loader"'s responsibility in copying the program into memory is to add the "base offset pointer" to all the other offesets, so all the "names" describing memory addresses refer to actual memory addresses instead of "offsets from the beginning of the program".
That's generally a good thing, as if they were encoded directly, two programs that needed the same set of addresses couldn't be run at the same time without corrupting each other's shared memory. In addition, loading a program into a different starting address would not be possible, as walking outside of the memory of your program (nearly guaranteed if you relocate the program without rewriting the memory address references) is going to raise a segfault in the operating system's memory management monitors.
Also you need a name to start at, and this means that the offsets are bound to the variable names. Generally it is much easier to do fishing around in the heap based off of an alloc'd item than it is to truly find the start of the program loaded in memory (because the C programming language doesn't really capture that address into a in-language variable name, and the layout is somewhat system dependent).
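A minimal sketch of that idea (the names here are just illustrative): allocate a single block and lay your values out relative to its start, which gives you a compact, contiguous "memory map" without caring where the block actually lands.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* One contiguous block; the allocator/OS decides where it lives. */
    int *virtual_memory = malloc(16 * sizeof *virtual_memory);
    if (!virtual_memory)
        return 1;

    /* Index relative to the block's start instead of using absolute addresses. */
    virtual_memory[0] = 5;
    virtual_memory[1] = 7;

    /* The elements really are adjacent in (virtual) memory. */
    printf("slot 0 at %p, slot 1 at %p\n",
           (void *)&virtual_memory[0], (void *)&virtual_memory[1]);

    free(virtual_memory);
    return 0;
}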
I am very new to these concepts, but I want to ask a question that I think is very basic; I am still confused, so I am asking it.
The question is...
How is the size of a process determined by the OS?
Let me clarify: suppose I have written a C program and I want to know how much memory it is going to take; how can I determine that? Secondly, I know a process has several sections, like the code section, data section, and BSS. Are their sizes predetermined? Also, how are the sizes of the stack and heap determined, and do they count toward the total size of the process?
Also, we say that when we load the program, an address space is given to the process (done by the base and limit registers and controlled by the MMU, I guess), and when the process tries to access a memory location that is not in its address space we get a segmentation fault. How is it possible for a process to access memory that is not in its address space? According to my understanding, when a buffer overflow happens an address gets corrupted, and when the process then accesses the corrupted location we get the segmentation fault. Is there any other way an address violation can occur?
Thirdly, why does the stack grow downward and the heap upward? Is this the same on all operating systems? How does it affect performance, and why can't we have it the other way?
Please correct me, if I am wrong in any of the statement.
Thanks
Sohrab
When a process is started it gets its own virtual address space. The size of the virtual address space depends on your operating system. In general, 32-bit processes get a 4 GiB (2^32 byte) address space and 64-bit processes get a 16 EiB (2^64 byte) address space.
You cannot in any way access anything that is not mapped into your virtual address space as by definition anything that is not mapped there does not have an address for you. You may try to access areas of your virtual address space that are currently not mapped to anything, in which case you get a segfault exception.
Not all of the address space is mapped to something at any given time, and not all of it may be mappable at all (how much can be mapped depends on the processor and the operating system). On current-generation Intel processors (48-bit virtual addresses) up to 256 TiB of your address space may be mapped. Note that operating systems can limit that further. For example, for 32-bit processes (with a 4 GiB address space) Windows by default reserves 2 GiB for the system and 2 GiB for the application (but there's a way to make it 1 GiB for the system and 3 GiB for the application).
How much of the address space is being used and how much is mapped changes while the application runs. Operating system specific tools will let you monitor what the currently allocated memory and virtual address space is for an application that is running.
Code section, data section, BSS etc. are terms that refer to different areas of the executable file created by the linker. In general, code is separate from static immutable data, which is separate from statically allocated but mutable data. Stack and heap are separate from all of the above; the sizes of the sections are computed by the compiler and the linker. Note that each binary file has its own sections, so any dynamically linked libraries will be mapped into the address space separately, each with its own sections mapped somewhere. Heap and stack, however, are not part of the binary image; there is generally just one heap per process and one stack (one per thread, strictly speaking).
The size of the stack (at least the initial stack) is generally fixed. Compilers and/or linkers generally have some flags you can use to set the size of the stack that you want at runtime. Stacks generally "grow backward" because that's how the processor stack instructions work. Having stacks grow in one direction and the rest grow in the other makes it easier to organize memory in situations where you want both to be unbounded but do not know how much each can grow.
Heap, in general, refers to anything that is not pre-allocated when the process starts. At the lowest level there are several logical operations that relate to heap management (not all are implemented as I describe here in all operating systems).
While the address space is fixed, some OSs keep track of which parts of it are currently claimed by the process. Even if this is not the case, the process itself needs to keep track of it. So the lowest-level operation is to actually decide that a certain region of the address space is going to be used.
The second low level operation is to instruct the OS to map that region to something. This in general can be
some memory that is not swappable
memory that is swappable and mapped to the system swap file
memory that is swappable and mapped to some other file
memory that is swappable and mapped to some other file in read only mode
the same mapping that another virtual address region is mapped to
the same mapping that another virtual address region is mapped to, but in read only mode
the same mapping that another virtual address region is mapped to, but in copy on write mode with the copied data mapped to the default swap file
There may be other combinations I forgot, but those are the main ones; a small sketch using POSIX mmap for a couple of these cases follows below.
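As an illustration on Linux (mmap is one common way these operations are exposed; the file path below is just an example):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* Anonymous, writable, swappable memory (what a heap implementation might request). */
    void *anon = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    /* Read-only mapping backed by a file; MAP_PRIVATE instead would give copy-on-write. */
    int fd = open("/etc/hostname", O_RDONLY);            /* example file */
    void *file = (fd >= 0)
        ? mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0)
        : MAP_FAILED;

    if (anon == MAP_FAILED || file == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    printf("anonymous mapping at %p, file mapping at %p\n", anon, file);

    munmap(file, 4096);
    munmap(anon, 4096);
    close(fd);
    return 0;
}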
Of course the total space used really depends on how you define it. RAM currently used is different than address space currently mapped. But as I wrote above, operating system dependent tools should let you find out what is currently happening.
The sections are predetermined by the executable file.
Besides that one, there may be those of any dynamically linked libraries. While the code and constant data of a DLL is supposed to be shared across multiple processes using it and not be counted more than once, its process-specific non-constant data should be accounted for in every process.
Besides, there can be dynamically allocated memory in the process.
Further, if there are multiple threads in the process, each of them will have its own stack.
What's more, there are going to be per-thread, per-process and per-library data structures in the process itself and in the kernel on its behalf (thread-local storage, command line params, handles to various resources, structures for those resources as well and so on and so forth).
It's difficult to calculate the full process size exactly without knowing how everything is implemented. You might get a reasonable estimate, though.
Regarding "According to my understanding when some buffer overflow happens then the address gets corrupted": that's not necessarily true. First of all, the address of what? It depends on what happens to be in memory near the buffer. If there's an address there, it can get overwritten during a buffer overflow. But if there's another buffer nearby that contains a picture of you, the pixels of the picture can get overwritten instead.
You can get segmentation or page faults when trying to access memory for which you don't have the necessary permissions (e.g. the kernel portion that's mapped or otherwise present in the process address space). Or it can be a read-only location. Or the location can have no mapping to physical memory.
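For instance, writing to a string literal (which typically lives in a read-only mapping) is a classic way to hit the read-only case:

char *s = "constant";   /* the literal is usually placed in a read-only segment */
s[0] = 'X';             /* undefined behavior; on most systems this raises SIGSEGV */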
It's hard to tell how the location and layout of the stack and heap are going to affect performance without knowing the performance of what we're talking about. You can speculate, but the speculations can turn out to be wrong.
Btw, you should really consider asking separate questions on SO for separate issues.
"How is it possible for a process to access a memory that is not in its address space?"
Given memory protection it's impossible. But it might be attempted. Consider random pointers or access beyond buffers. If you increment any pointer long enough, it almost certainly wanders into an unmapped address range. Simple example:
char *p = "some string";
while (*p++ != 256) /* Always true. Keeps incrementing p until segfault. */
;
Simple errors like this are not unheard of, to make an understatement.
I can answer questions #2 and #3.
Answer #2
When you use pointers in C you are really using a numerical value that is interpreted as an address in memory (a logical address on a modern OS; see the footnote). You can modify this address at will. If the value points to an address that is not in your address space, you get a segmentation fault.
Consider for instance this scenario: your OS gives your process the address range from 0x01000 to 0x09000. Then:
int *ptr = (int *)0x01000;
printf("%d", ptr[0]);      /* reads sizeof(int) bytes from inside your address space */

ptr = (int *)0x09100;
printf("%d", ptr[0]);      /* accessing outside your range: segfault */
Most causes of a segfault, as you pointed out, are dereferencing NULL pointers (NULL is usually address 0x00, but that is implementation dependent) or using corrupted addresses.
Note that, on Linux i386, the base and limit registers are not used as you might think. They are not per-process limits; they describe two kinds of segments: user space and kernel space.
Answer #3
The stack growth direction is hardware dependent, not OS dependent. On i386, assembly instructions like push and pop make the stack grow downward through the stack-related registers: the stack pointer automatically decreases when you do a push and increases when you do a pop. The OS cannot change that.
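You can observe the downward growth from C, too; on a typical x86 system each deeper call frame ends up at a lower address (a small illustrative program, not something the C standard guarantees):

#include <stdio.h>

static void probe(int depth)
{
    int local;                                  /* lives in this call's stack frame */
    printf("depth %d: &local = %p\n", depth, (void *)&local);
    if (depth < 3)
        probe(depth + 1);                       /* deeper frames print lower addresses */
}

int main(void)
{
    probe(0);
    return 0;
}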
Footnotes
In a modern OS, a process uses so-called logical addresses. These addresses are mapped to physical addresses by the OS. To see this, compile and run this simple program:
#include <stdio.h>
int main(void)
{
    int a = 10;
    printf("%p\n", (void *)&a);   /* %p expects a pointer to void */
    return 0;
}
If you run this program multiple times (even simultaneously) you may well see the same address printed for different instances (with address space layout randomization the addresses can vary between runs, but two running instances can still legitimately report the same address). Of course this is not the real memory address; it is a logical address that will be mapped to a physical address when needed.
This query is regarding allocation of memory using malloc.
Generally we say that malloc allocates memory from the heap.
Now say I have a plain embedded system (no operating system) with a normal program loaded, and I call malloc in my program.
In this case, where is the memory allocated from?
malloc() is a function that is usually implemented by the runtime library. You are right: if you are running on top of an operating system, then malloc will sometimes (but not every time) trigger a system call that makes the OS map some memory into your program's address space.
If your program runs without an operating system, then you can think of your program as being the operating system. You have access to all addresses, meaning you can just assign an address to a pointer, then dereference that pointer to read/write.
Of course you have to make sure that no other parts of your program use the same memory on their own, so you write your own memory manager:
To put it simply, you can set aside a range of addresses in which your "memory manager" records which address ranges are already in use (the data structure kept there can be as simple as a linked list or much more complex). Then you write a function, call it e.g. malloc(), which forms the functional part of your memory manager: it looks into that data structure to find a free range as long as the argument specifies and returns a pointer to it.
Now, if every function in your program calls your malloc() instead of randomly writing to custom addresses, you've done the first step. You can then write a free() function which looks up the pointer it is given in that data structure and adapts it (in the naive linked-list case it would merge two links).
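A deliberately tiny sketch of the idea (a bump allocator over a static array; a real memory manager would add a free list, overflow checks, and per-block bookkeeping):

#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 4096u

/* The "set aside" address range, aligned for any object type. */
static _Alignas(max_align_t) uint8_t pool[POOL_SIZE];
static size_t pool_used = 0;     /* simplest possible bookkeeping */

void *my_malloc(size_t size)
{
    /* Round the request up so the next block stays suitably aligned. */
    const size_t align = _Alignof(max_align_t);
    size = (size + align - 1) & ~(align - 1);

    if (size > POOL_SIZE - pool_used)
        return NULL;             /* pool exhausted */

    void *block = &pool[pool_used];
    pool_used += size;
    return block;
}

void my_free(void *ptr)
{
    /* A bump allocator cannot reuse freed blocks; a real memory manager would
       track each allocation (e.g. in a free list) and recycle it here. */
    (void)ptr;
}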
The only real answer is "Wherever your compiler/library-implementation puts it".
In the embedded system I use, there is no heap, since we haven't written one.
From the heap as you say. The difference is that the heap is not provided by the OS. Your application's linker script will no doubt include an allocation for the heap. The run-time library will manage this.
In the case of the Newlib C library, often used in GCC-based embedded systems not running an OS (or at least not running Linux), the library has a stub syscall function called sbrk(). It is the responsibility of the developer to implement sbrk(), which must provide more memory to the heap manager on request. Typically it merely increments a pointer and returns a pointer to the start of the new block; thereafter the library's heap manager manages and maintains the new block, which may or may not be contiguous with previous blocks.
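A representative bare-metal implementation, sketched under the common convention that the linker script exports an `end` symbol just past .bss (the symbol name and the exact prototype vary by toolchain, and the check for collision with the stack is omitted):

#include <stddef.h>

extern char end;                 /* from the linker script: first address past .bss */
static char *heap_end = NULL;

void *_sbrk(ptrdiff_t incr)
{
    if (heap_end == NULL)
        heap_end = &end;         /* first call: heap starts right after static data */

    char *prev = heap_end;
    heap_end += incr;            /* hand the next chunk of RAM to the heap manager */
    return prev;                 /* the library's malloc manages everything below this */
}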
Where do malloc() and free() store the allocated addresses and their sizes (Linux GCC)? I've read that some implementations store them somewhere before the actual allocated memory, but I could not confirm that in my tests.
The background, maybe someone has another tip for this:
I'm experimenting a little bit with analyzing the heap memory of a process in order to determine the current value of a string in the other process.
Accessing the process heap memory and strolling through it is no problem. However, because the value of the string changes and the process allocates a new part of the memory each time, the string's address changes. Because the string has a fixed format it's still easy to find, but after a few changes the old versions of the string are still in the heap memory (probably freed, but still not reused / overwritten) and thus I'm not able to tell which string is the current one.
So, in order to still find the current one I want to check if a string I find in the memory is still used by comparing its address against the addresses malloc() and free() know about.
ciao,
Elmar
There are lots of ways in which malloc/free can store the size of the memory area. For example, it might be stored just before the area returned by malloc. Or it might be stored in a lookup table elsewhere. Or it might be stored implicitly: some areas might be reserved for specific sizes of allocations.
To find out how the C library in Linux (glibc) does this, get the source code from http://ftp.gnu.org/gnu/glibc/ and look at the malloc/malloc.c file. There is some documentation at the top, and it refers to A Memory Allocator by Doug Lea.
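As a quick glibc-specific probe of the "stored just before the area" layout (this relies on allocator internals plus the non-standard malloc_usable_size() extension, so it may break with other allocators or future glibc versions):

#include <malloc.h>    /* malloc_usable_size() - glibc extension */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *p = malloc(100);
    if (p == NULL)
        return 1;

    /* Ask the allocator how large the block really is (often rounded up). */
    printf("usable size: %zu\n", malloc_usable_size(p));

    /* glibc's ptmalloc keeps the chunk's size word just before the returned
       pointer; the low three bits are flags and must be masked off. This is
       an implementation detail and can change between versions. */
    size_t size_word = ((size_t *)p)[-1];
    printf("size word: %#zx (chunk size %zu)\n", size_word, size_word & ~(size_t)7);

    free(p);
    return 0;
}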
This is up to the implementation of the standard library, of course. So your best bet would probably be to dig through the source of the library (glibc is the default on Linux) and see if you can figure it out. It is probably not going to be trivial.