I am confused about the segments in RAM memory; please clarify the following doubts:
RAM has been divided into user space and kernel space; is this memory division done by the OS or by the hardware (CPU)?
What are the contents of kernel space? As far as I have understood there will be the kernel image only; please correct me if I am wrong.
Where do the code, data, stack and heap segments exist?
a) Do user space and kernel space have separate code, data, stack and heap segments?
b) Are these segments created by the hardware or the OS?
Can I find the amount of memory occupied by kernel space and user space?
a) Is there any Linux command or system call to find this?
Why has RAM been divided into user space and kernel space?
a) I feel it is done to keep the kernel safe from application programs. Is that so, and is it the only reason?
I am a beginner, so please suggest some good books, links and the way to approach these concepts.
I took up the challenge and tried to give rather short answers:
Execution happens in user and kernel space. The BIOS and CPU support the OS in detecting and separating resources/address ranges such as main memory and devices (-> related question) to establish protected mode. In protected mode, memory is separated via virtual address spaces, which are mapped page-wise (usually in blocks of 4096 bytes) to real addresses of physical memory via the MMU (Memory Management Unit).
From user space, one cannot access memory directly (as in real mode); one has to access it via the MMU, which acts like a transparent proxy with access protection. Access errors are known as segmentation fault, access violation or segmentation violation (SIGSEGV), which are abstracted as NullPointerException (NPE) in high-level programming languages like Java.
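For illustration, the canonical way to provoke such a fault is a write through an unmapped (null) address, which the MMU rejects and the kernel turns into SIGSEGV. A minimal sketch (the compiler may warn about the deliberate null dereference):

#include <stdio.h>

int main(void) {
    int *p = NULL;   /* no valid mapping behind this address */
    *p = 42;         /* the MMU faults; the kernel delivers SIGSEGV */
    printf("never reached\n");
    return 0;
}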
Read about protected mode, real mode and 'rings'.
Note: Special CPUs, such as in embedded systems, don't necessarily have an MMU and could therefore be limited to special OSes like µClinux or FreeRTOS.
The kernel also allocates buffers, and the same goes for drivers (e.g. I/O buffers for disks, network interfaces and GPUs).
Generally, resources exist per space and per process/thread.
a) Each thread additionally has its own, protected kernel stack, and the kernel has separate code (also called 'text'), data and heap segments. Also, each process has its own resources.
b) CPU architectures impose certain requirements (depending on the degree of support they offer), but in the end it is the software (the kernel and the user-space libraries used for interfacing) that creates these structures.
Every reasonable OS provides at least one way to do that.
a) Try sudo cat /proc/slabinfo or simply sudo slabtop
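For a programmatic route, here is a minimal sketch (my own, assuming a modern Linux where /proc/meminfo exposes the Slab, KernelStack and PageTables counters) that prints the main kernel-side memory counters:

#include <stdio.h>
#include <string.h>

/* Print the kernel-side memory counters from /proc/meminfo
 * (Slab, KernelStack, PageTables) as a rough measure of how
 * much RAM the kernel itself is using. */
int main(void) {
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f) { perror("/proc/meminfo"); return 1; }
    char line[128];
    while (fgets(line, sizeof(line), f)) {
        if (strncmp(line, "Slab:", 5) == 0 ||
            strncmp(line, "KernelStack:", 12) == 0 ||
            strncmp(line, "PageTables:", 11) == 0)
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}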
See 1. above.
a) Primarily, yes, just like user space processes are isolated from each other, except for special techniques such as CMA (Cross Memory Attach) for fast direct access in newer kernels.
Search the Stack Exchange sites for recommended books.
memfd_secret() was merged into the kernel, but I do not see its real security benefit. I mean, the idea is to avoid side-channel attacks, but this is like locking away the car keys where nobody knows where they are.
AFAIK, the page given to the application simply is not mapped when in kernel mode, but this can't be used to isolate a virus, or whatever, from the kernel itself.
How is it supposed to be safer to isolate a range of memory from the kernel?
Could someone provide a code example showing how this protects against Spectre or similar attacks?
Update
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Warning: this is effectively a fork bomb -- every iteration, in the
 * parent and in each child alike, spawns another process until the
 * system runs out of resources. */
int main(void) {
    while (true) {
        int rc = fork();
        if (rc == -1) {
            perror("fork error");
        }
    }
}
memfd_secret() allows a user-space process to have a "secret" memory area. In this context, "secret" means that other processes cannot have access to that memory area (not even the kernel itself, or at least not by accident).
This syscall allows a process to store confidential information (like a password or a private key) in a more secure way, because it's harder for malware to access that secret memory area. This syscall should also protect from vulnerabilities like Spectre, because the secret memory area is uncached; and it should also protect (albeit not completely, but at least partially) from kernel bugs, since the kernel has no access to that memory area.
In order to use this syscall (that will be available in Linux 5.14), you first make a call to memfd_secret() in order to obtain a file descriptor; then you make a call to ftruncate() in order to choose the size of the secret memory region; and finally you use mmap() in order to map the secret memory, so you can access it via pointers as usual.
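A minimal sketch of that sequence (assuming Linux >= 5.14 booted with secretmem enabled, and that your libc headers define SYS_memfd_secret; glibc has no wrapper, so the raw syscall is used):

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/syscall.h>

int main(void) {
    /* Step 1: obtain the file descriptor via the raw syscall. */
    int fd = syscall(SYS_memfd_secret, 0);
    if (fd < 0) { perror("memfd_secret"); return 1; }

    /* Step 2: choose the size of the secret region (one page here). */
    size_t len = 4096;
    if (ftruncate(fd, len) < 0) { perror("ftruncate"); return 1; }

    /* Step 3: map it, then use it via pointers as usual. */
    char *secret = mmap(NULL, len, PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);
    if (secret == MAP_FAILED) { perror("mmap"); return 1; }

    strcpy(secret, "hunter2");
    printf("stored %zu secret bytes\n", strlen(secret) + 1);

    munmap(secret, len);
    close(fd);
    return 0;
}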
Other details are available here.
EDIT: unfortunately, the "uncached" feature that made memfd_secret() less vulnerable to attacks like Spectre has been removed because there was a concern for performance.
EDIT 2: additional details about why secret memory areas obtained with memfd_secret() make programs safer (source, slightly modified by me for clarity):
Enhanced protection (in conjunction with all the other in-kernel
attack prevention systems) against ROP attacks. Secret memory makes "simple"
ROP insufficient to perform exfiltration, which increases the required
complexity of the attack. Along with other protections like the kernel
stack size limit and address space layout randomization, which make finding
gadgets really hard, the absence of any in-kernel primitive for accessing
secret memory means the one-gadget ROP attack can't work. Since the only
way to access secret memory is to reconstruct the missing mapping entry,
the attacker has to recover the physical page and insert a PTE pointing to
it in the kernel and then retrieve the contents. That takes at least three
gadgets which is a level of difficulty beyond most standard attacks.
Prevent cross-process secret user-space memory exposures. Once the secret
memory is allocated, the user can't accidentally pass it into the kernel to
be transmitted somewhere. The secret memory pages cannot be accessed via the
direct map and they are disallowed in GUP.
Harden against exploited kernel flaws. In order to access secret memory, a
kernel-side attack would need to either walk the page tables and create new
ones, or spawn a new privileged user-space process to perform secrets
exfiltration using ptrace.
EDIT 3: just a note that I think may be relevant: secret memory areas can be accessed by child processes created using fork(), so one must be cautious. At least, by using the flag FD_CLOEXEC (passed to memfd_secret()), the process will not make the secret memory available to processes created with execve().
Is there a way to allocate contiguous physical memory from user space in Linux? At least a few guaranteed contiguous memory pages. One huge page isn't the answer.
No. There is not. You do need to do this from kernel space.
If you say "we need to do this from user space" without anything going on in kernel space, it makes little sense, because a user-space program has no way of controlling, or even knowing, whether the underlying memory is contiguous.
The only reason you would need to do this is if you were working in conjunction with a piece of hardware, or some other low-level (i.e. kernel) service, that has this requirement. So again, you would have to deal with it at that level.
So the answer isn't just "you can't" - it's "you should never need to".
I have written memory managers that do allow me to do this, but it was always because of some underlying issue at the kernel level, which had to be addressed at the kernel level. Generally it was because some other agent on the bus (a PCI card, the BIOS, or even another computer over an RDMA interface) had the physically contiguous memory requirement. Again, all of this had to be addressed in kernel space.
When you talk about "cache lines" - you don't need to worry. You can be assured that each page of your user-space memory is contiguous, and each page is much larger than a cache-line (no matter what architecture you're talking about).
Yes, if all you need is a few pages, this may indeed be possible.
The file /proc/[pid]/pagemap now allows programs to inspect the mapping of their virtual memory to physical memory.
While you cannot explicitly modify the mapping, you can allocate a virtual page, lock it into memory via a call to mlock, record its physical address via a lookup in /proc/self/pagemap, and repeat until you happen to get enough blocks touching each other to form a large enough contiguous block. Then unlock and free the excess blocks.
It's hackish, clunky and potentially slow, but it's worth a try. On the other hand, there's a decently large chance that this isn't actually what you really need.
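A minimal sketch of the pagemap lookup at the heart of that trick (assuming Linux's documented pagemap layout: bit 63 = page present, bits 0-54 = page frame number; note that since Linux 4.0 the PFN field reads back as zero without CAP_SYS_ADMIN, so run it as root):

#define _GNU_SOURCE
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

/* Return the physical frame number backing vaddr, or 0 on failure.
 * The page must be resident (hence the mlock below). */
static uint64_t pfn_of(void *vaddr) {
    long pagesize = sysconf(_SC_PAGESIZE);
    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0) return 0;
    uint64_t entry = 0;
    off_t off = ((uintptr_t)vaddr / pagesize) * sizeof(entry);
    if (pread(fd, &entry, sizeof(entry), off) != (ssize_t)sizeof(entry))
        entry = 0;
    close(fd);
    if (!(entry & (1ULL << 63))) return 0;   /* page not present */
    return entry & ((1ULL << 55) - 1);       /* bits 0-54: the PFN */
}

int main(void) {
    long pagesize = sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, 2 * pagesize, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }
    mlock(p, 2 * pagesize);          /* fault the pages in and pin them */

    uint64_t pfn0 = pfn_of(p), pfn1 = pfn_of(p + pagesize);
    printf("PFNs: %#llx %#llx -> physically %scontiguous\n",
           (unsigned long long)pfn0, (unsigned long long)pfn1,
           (pfn0 && pfn1 == pfn0 + 1) ? "" : "NOT ");
    return 0;
}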
The DPDK library's memory allocator uses the approach @Wallacoloo described; see eal_memory.c. The code is BSD-licensed.
If a specific device driver exports a DMA buffer which is physically contiguous, user space can access it through the dma-buf APIs; so a user task can access such memory, but cannot allocate it directly. That is because the physically-contiguous constraint comes not from user applications but only from the device, so only device drivers should care.
As part of a learning project, I've worked a bit on Spectre and Meltdown PoCs to get myself more comfortable with the concepts. I have managed to recover previously accessed data using the clock timers, but now I'm wondering how they actually read physical memory from that point.
Which leads to my question: in a lot of Spectre v1/v2 examples, you can read this piece of toy-code example:
if (x < y) {
    z = array[x];
}
with x supposedly being equal to attacked_address - address_of_array, which will effectively lead to z getting the value at attacked_address.
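For context, the value that lands in z is then leaked through a cache side channel. A minimal flush+reload sketch of that exfiltration step (x86-only, assuming _mm_clflush and __rdtscp are available; in a real PoC the byte would come from the speculative out-of-bounds load, not the direct assignment used here):

#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>              /* _mm_clflush, __rdtscp */

static uint8_t probe[256 * 4096];   /* one page per possible byte value */

/* Time one load; a low cycle count means the line was already cached. */
static uint64_t time_load(volatile uint8_t *addr) {
    unsigned aux;
    uint64_t t0 = __rdtscp(&aux);
    (void)*addr;
    uint64_t t1 = __rdtscp(&aux);
    return t1 - t0;
}

int main(void) {
    for (int i = 0; i < 256; i++)   /* flush: evict every probe line */
        _mm_clflush(&probe[i * 4096]);

    uint8_t z = 42;                 /* stands in for the speculatively read byte */
    *(volatile uint8_t *)&probe[z * 4096];  /* leaves a cache footprint */

    int best = -1;
    uint64_t best_t = UINT64_MAX;
    for (int i = 0; i < 256; i++) { /* reload: find the hot line by timing */
        uint64_t t = time_load(&probe[i * 4096]);
        if (t < best_t) { best_t = t; best = i; }
    }
    printf("recovered byte: %d\n", best);
    return 0;
}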
In the example it's quite easy to understand, but in reality how do they even know what attacked_address looks like?
Is it a virtual address with an offset, or a physical address, and how do they manage to find where the "important memory" is located in the first place?
In the example it's quite easy to understand, but in reality how do they even know what attacked_address looks like?
You are right: Spectre and Meltdown are just possibilities, not ready-to-use attacks. If you know an address to attack from other sources, Spectre and Meltdown are the way to get the data, even from a browser.
Is it a virtual address with an offset, or a physical address, and how do they manage to find where the "important memory" is located in the first place?
Sure, it is a virtual address, since it all happens in a user-space program. But prior to the recent kernel patches, the full kernel space was mapped into each user-space process. That was done to speed up system calls, i.e. to do just a privilege context switch rather than a full process context switch for each syscall.
So, due to that design and Meltdown, it is possible to read kernel space from an unprivileged user-space application (for example, a browser) on unpatched kernels.
In general, the easiest attack scenario is to target machines with old kernels which do not use address randomization, i.e. kernel symbols are at the same place on any machine running the specific kernel version. Basically, you run the specific kernel on a test machine, write down the "important memory addresses" and then just run the attack on a victim's machine using those addresses.
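To illustrate that reconnaissance step, here is a small sketch (mine, not from the PoC below) that records the address of the linux_proc_banner symbol via /proc/kallsyms on a test machine; with kptr_restrict set (the default on modern distros), non-root reads show all-zero addresses:

#include <stdio.h>
#include <string.h>

/* Look up linux_proc_banner in /proc/kallsyms -- the
 * "write down the address on a test machine" step. */
int main(void) {
    FILE *f = fopen("/proc/kallsyms", "r");
    if (!f) { perror("/proc/kallsyms"); return 1; }
    char line[512], type, name[256];
    unsigned long addr;
    while (fgets(line, sizeof(line), f)) {
        if (sscanf(line, "%lx %c %255s", &addr, &type, name) == 3 &&
            strcmp(name, "linux_proc_banner") == 0)
            printf("linux_proc_banner at %#lx\n", addr);
    }
    fclose(f);
    return 0;
}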
Have a look at my Spectre-based Meltdown PoC (i.e. 2-in-1): https://github.com/berestovskyy/spectre-meltdown
It is much simpler and easier to understand than the original code from the Spectre paper, and it is just 99 lines of C (including comments).
It uses the technique described above, i.e. for Linux 3.13 it simply tries to read the predefined address 0xffffffff81800040, which is the linux_proc_banner symbol located in kernel space. It runs without any privileges on different machines with kernel 3.13 and successfully reads kernel space on each machine.
It is harmless, but just a tiny working PoC.
I saw the above question asked in an interview on one of the career forums, and I see different answers. It would be great to know what the experts at SO say.
I think all the memory will be cleared and the program would start afresh on each execution;
none of the virtual memory will be shared.
The loader 'may' re-use the same physical memory pages, it 'may' re-use some of the physical memory pages, and it 'may' not re-use any of the physical memory pages. It all depends on the current 'memory load'.
Do you understand the concepts of 'mapped' memory pages and 'virtual' addresses? How about 'virtual' memory? How about the concept of 'working set' of memory pages?
Overall, these concepts result in:
The address of your program in the physical memory is not deterministic.
The address of your program in the virtual address space will probably be the same from one execution to the next.
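A quick sketch to observe this yourself (note that with PIE binaries and ASLR, the default on modern Linux, even the virtual addresses change between runs unless randomization is disabled):

#include <stdio.h>

int main(void) {
    int local = 0;
    /* Run this twice: with ASLR enabled the virtual addresses differ
     * per run; with ASLR disabled (e.g. `setarch $(uname -m) -R ./a.out`)
     * they repeat exactly. The physical pages behind them are
     * unpredictable either way. */
    printf("main at %p, a stack variable at %p\n",
           (void *)main, (void *)&local);
    return 0;
}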
The only reasonable answer to this question is:
"This question has no reasonable answer. The short answer is:'Nothing.', which is more or less true, depending on the underlying hardware and operating system - if there is actually one, which isn't necessary. For the C language itself, if this is a C language question, only memory that is declared static or has an explicit initializer, will have values set when it starts, so any other variable has a theoretical chance of containing values from a previous run."
I think all the memory will be cleared
The interviewer might fish for such an answer. Who would clear it and why?
The answer to this entirely depends upon the operating system. Some systems have mechanisms for processes running the same application (or library) to share the same copy. Others do not.
Older multi-user systems were better equipped for this than the systems more common today.
If it is executed simultaneously (assuming the same program is running), the text region (containing the code bits) is most probably shared. This is because, although the virtual addresses of the two processes are different, the kernel's mapping to the text region can be shared with the next execution of the same binary.
The other regions of the program, like data, stack etc., are unique to each execution instance of the process and are not shared.
I'd like to share certain memory areas between multiple computers, that is, for a C/C++ project. When something on computer B accesses a memory area which is currently on computer A, it has to be locked on A and sent to B. I'm fine if it's Linux-compatible only.
Thanks in advance :D
You cannot do this for a simple C/C++ project.
Common computer hardware does not have the physical properties that support this directly: Memory on one system cannot be read by another system.
In order to make it appear to C/C++ programs on different machines that they are sharing memory, you have to write software that provides this function. Typically, you would need to do something like this:
Allocate some pages in the virtual memory address space (of each process).
Mark those pages read-only.
Set a handler to receive the exception that occurs when the process attempts to write to the read-only memory. (This handler might be in the operating system, as some sort of kernel extension, or it might be a signal handler in your process.)
When the exception is received, determine what the process was attempting to write to memory. Write that to the page (perhaps by writing it through a separate mapping in virtual memory to the same physical memory, with this extra mapping marked writeable).
Send a message by network communications to the other machine telling it that memory has changed.
Resume execution in the process after the instruction that wrote to memory.
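A minimal Linux-specific sketch of the trap-and-forward core of those steps (names are mine; the network part is reduced to a comment, and mprotect() is not formally async-signal-safe, although this idiom is common for the technique):

#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

static char *shared_page;
static long page_size;

/* Fault handler: runs when the process writes to the read-only page.
 * A real system would decode the write and replicate it over the
 * network here; this sketch just re-enables write access and returns,
 * which makes the CPU retry the faulting store. */
static void on_segv(int sig, siginfo_t *info, void *ctx) {
    (void)sig; (void)ctx;
    char *addr = info->si_addr;
    if (addr >= shared_page && addr < shared_page + page_size) {
        /* "tell the other machine that memory changed" would go here */
        mprotect(shared_page, page_size, PROT_READ | PROT_WRITE);
    } else {
        _exit(1);               /* a genuine crash, not our page */
    }
}

int main(void) {
    page_size = sysconf(_SC_PAGESIZE);
    shared_page = mmap(NULL, page_size, PROT_READ,   /* read-only */
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (shared_page == MAP_FAILED) return 1;

    struct sigaction sa = {0};
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = on_segv;
    sigaction(SIGSEGV, &sa, NULL);

    shared_page[0] = 'x';       /* traps, handler runs, store is retried */
    printf("wrote: %c\n", shared_page[0]);
    return 0;
}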
Additionally, you need to determine what to do about memory coherence: If two processes write to the same address in memory at nearly the same time, what happens? If process A writes to location X and then reads location Y while, at nearly the same time, process B writes to location Y and reads X, what do they see? Is it okay if the two processes see data that cannot possibly be the result of a single time sequence of writes to memory?
On top of all that, this is hugely expensive in time: Stores to memory that require exception handling and network operations take many thousands, likely hundreds of thousands, times as long as normal stores to memory. Your processes will execute excruciatingly slowly whenever they write to this shared memory.
There are software solutions, as noted in the comments. These use the paging hardware in the processors on a node to detect access, and use your local network fabric to disseminate the changes to the memory. One hardware alternative is reflective memory - you can read more about it here:
https://en.wikipedia.org/wiki/Reflective_memory
http://www.ecrin.com/embedded/downloads/reflectiveMemory.pdf
The old page was broken:
http://www.dolphinics.com/solutions/embedded-system-reflective-memory.html
Reflective memory provides low latency (about one microsecond per hop) in either a ring or tree configuration.