Dynamic Library Injection - c

Where does an injected DLL reside in memory?
My understanding of DLLs is that they get loaded once into physical memory so that multiple programs can use their functions, saving resources by having reusable code loaded once instead of in every single process. The DLL's functions are located outside the process's address space and can be found via the memory mapping segment between the heap and stack segments. For each process the memory mapping segment is located at a different virtual address; however, the libraries/functions the memory-mapped segment points to must be at the same address for each process, otherwise they couldn't be shared between processes (assuming my understanding of this is all correct).
So when it's said that a DLL is loaded into a process' address space, is it loaded into the memory of that process, or somewhere else in the system and the process adds to the memory map to locate it? And on that note, what allows an injected library to migrate from one process to another?

Related

Is it possible to share the data section (variables) between a shared lib that is `dlopen`ed twice in the same process? [duplicate]

I can't seem to find an answer after searching for this out on the net.
When I use dlopen the first time it seems to take longer than any time after that, including if I run it from multiple instances of a program.
Does dlopen load the .so into memory once and have the OS save it, so that any following calls, even from another instance of the program, point to the same spot in memory?
So basically, do 3 instances of a program using a library mean 3 instances of the same .so are loaded into memory, or is there only one instance in memory?
Thanks
Does dlopen load the .so into memory once and have the OS save it, so that any following calls, even from another instance of the program, point to the same spot in memory?
Multiple calls to dlopen from within a single process are guaranteed not to load the library more than once. From the man page:
If the same shared object is loaded again with dlopen(), the same
object handle is returned. The dynamic linker maintains reference
counts for object handles, so a dynamically loaded shared object is
not deallocated until dlclose() has been called on it as many times
as dlopen() has succeeded on it.
When the first call to dlopen happens, the library is mmaped into the calling process. There are usually at least two separate mmap calls: the .text and .rodata sections (which usually reside in a single RO segment) are mapped read-only, the .data and .bss sections are mapped read-write.
A subsequent dlopen from another process performs the same mmaps. However the OS does not have to load any of the read-only data from disk -- it merely increments reference counts on the pages already loaded for the first dlopen call. That is the sharing in "shared library".
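A minimal sketch of the same-handle behaviour within a single process (the libm soname below is an assumption for a typical glibc system; on older glibc you also need to link with -ldl):

    /* Sketch: a second dlopen() of an already-loaded library returns the
     * same handle and only bumps the reference count. */
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void) {
        void *h1 = dlopen("libm.so.6", RTLD_NOW);   /* soname assumed */
        void *h2 = dlopen("libm.so.6", RTLD_NOW);
        if (!h1 || !h2) { fprintf(stderr, "dlopen: %s\n", dlerror()); return 1; }

        /* Same object handle, as the man page quoted above describes. */
        printf("h1 == h2: %s\n", h1 == h2 ? "yes" : "no");

        dlclose(h2);   /* decrements the reference count ...        */
        dlclose(h1);   /* ... the library is actually unmapped here */
        return 0;
    }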
So basically, do 3 instances of a program using a library mean 3 instances of the same .so are loaded into memory, or is there only one instance in memory?
Depends on what you call an "instance".
Each process will have its own set of (dynamically allocated) runtime loader structures describing this library, and each set will contain an "instance" of the shared library (which can be loaded at a different address in each process). Each process will also have its own instance of the writable data (which uses copy-on-write semantics). But the read-only mappings will all occupy the same physical memory (though they can appear at different addresses in each of the processes).
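To see the "different address in each process" part, a small sketch like the following (again assuming the libm soname) can be run twice; with ASLR the printed address usually differs per run, while the read-only pages remain shared in physical memory:

    /* Sketch: print where a function from a shared library lands in this
     * process's address space. */
    #include <dlfcn.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        void *h = dlopen("libm.so.6", RTLD_NOW);    /* soname assumed */
        if (!h) { fprintf(stderr, "dlopen: %s\n", dlerror()); return 1; }
        void *f = dlsym(h, "cos");
        printf("pid %d: cos() is mapped at %p\n", (int)getpid(), f);
        dlclose(h);
        return 0;
    }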

a single copy of a shared module’s code segment can be shared?

I'm reading a textbook which says:
modern systems compile the code segments of shared modules so that
they can be loaded anywhere in memory without having to be modified by
the linker. With this approach, a single copy of a shared module’s
code segment can be shared by an unlimited number of processes and
each process will still get its own copy of the read/write data
segment.
But how does each process get its own copy of the data segment of shared modules? Isn't that a conflict? For example, say the address range in memory for the data segment of a shared library is 0x400500 to 0x400600 and the range for the code segment is 0x400600 to 0x400700.
0x400600 to 0x400700 can be shared by multiple processes since functions don't have state, but if 0x400500 to 0x400600 is also shared by multiple processes, any process that makes a modification (e.g. to a global variable) will affect the other processes, is that correct?
This is accomplished by virtual memory. The memory space used by a given process is completely separate from the memory space used by other processes. The kernel maps a process's virtual memory to pieces of physical memory.
So for the data segment of a shared library, two processes may have the same virtual memory address for that segment, but each is mapped to a different physical memory segment. For the code segment of the library, both processes' virtual memory can map to the same physical memory since the segment is read-only.
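A small sketch of the same idea, using fork() so no separate shared library has to be built: the writable global lives at the same virtual address in both processes, but each process ends up with its own private physical copy (copy-on-write), which is exactly what happens to a shared library's read/write data segment:

    /* Sketch: same virtual address, private physical copies after a write. */
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int global_value = 1;   /* writable data, mapped copy-on-write */

    int main(void) {
        pid_t pid = fork();
        if (pid == 0) {                     /* child */
            global_value = 42;              /* write triggers copy-on-write */
            printf("child : addr=%p value=%d\n", (void *)&global_value, global_value);
            return 0;
        }
        wait(NULL);                         /* let the child print first */
        printf("parent: addr=%p value=%d\n", (void *)&global_value, global_value);
        return 0;
    }

The parent still prints value=1 even though both processes print the same address.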

Can I use the PTEs from one process, which point to fragments of physical memory, to create the corresponding PTEs in another process?

When we use mmap(..., MAP_ANON | MAP_SHARED) on Linux, then for the same region of fragmented physical memory (which was allocated), virtual memory pages (PTEs) are allocated between the processes. I.e. these PTEs are copied from the page table of one process to the page table of another process (with the same sequence of fragments of physical addresses of the allocated memory), is this true?
But mmap() needs to be done before fork(). And if we already have two running processes (i.e. after fork()), then we need to use a file for the mmap(). Which functions implement the mechanism of copying PTEs between two already established processes to create shared memory?
Can I use the PTEs/SGL (scatter-gather list) of one process, which point to the fragments of physical memory that have been allocated, to create the corresponding PTEs in another process through the Linux kernel, and how do I do that?
I want to understand how mmap() works at a lower level.
When we use mmap(..., MAP_ANON | MAP_SHARED) on Linux, then for the same region of fragmented physical memory (which was allocated), virtual memory pages (PTEs) are allocated between the processes.
Restate the question/statement, please, the above does not make sense.
I.e. these PTEs are copied from the page table of one process to the page table of another process (with the same sequence of fragments of physical addresses of the allocated memory), is this true?
No, it is not true.
When you establish a new mapping, the kernel first looks for a sufficiently large unused range of addresses in the virtual address space of the process. Then it modifies the corresponding page table entries to indicate that the address range is valid, but that the physical pages there are not present.
When you attempt to access an address in that range, a page fault is generated. The kernel looks in its data structures and determines that the access is valid. Then it allocates a fresh physical page, modifies the page table entry to establish the mapping between the virtual address and the physical address, and marks the page as present. Upon return from the page fault exception, the offending instruction is restarted and this time executes successfully.
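A small sketch of that demand-paging behaviour, using mincore() to check residency (a private anonymous mapping is used for simplicity, and exact reporting can vary between kernels, so treat this as illustrative):

    /* Sketch: pages of an anonymous mapping become resident only when touched. */
    #define _DEFAULT_SOURCE
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        size_t len  = 4 * page;
        unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        unsigned char vec[4];
        mincore(p, len, vec);
        printf("before touch: page 0 resident? %d\n", vec[0] & 1);

        p[0] = 1;   /* page fault here: the kernel now backs this page */

        mincore(p, len, vec);
        printf("after  touch: page 0 resident? %d\n", vec[0] & 1);

        munmap(p, len);
        return 0;
    }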
But mmap() needs to be done before fork(). And if we already have two running processes (i.e. after fork()), then we need to use a file for the mmap(). Which functions implement the mechanism of copying PTEs between two already established processes to create shared memory?
If you do an mmap after the fork, the two processes will create and initialize page table entries entirely independently of each other. However, when you mmap a file, the kernel will not simply allocate a free physical page: it will allocate a page, fill it with data from the file and put the page in the page/buffer cache. When a second process mmaps the same file, the kernel looks in the page cache, finds the physical page that corresponds to the same file and the required file offset, and points the PTE to that page. Now there are two completely independently created PTEs, which just point to the same physical page.
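As a sketch of exactly that case, the following can be started twice as two unrelated processes (once with the argument writer, once without); the file path and the writer/reader roles are just illustrative. Both map the same page of the file with MAP_SHARED, so their independently created PTEs end up pointing at the same page-cache page and the reader sees the writer's data:

    /* Sketch: two independent processes sharing memory via a file-backed
     * MAP_SHARED mapping of the same page-cache page. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(int argc, char **argv) {
        int fd = open("/tmp/shared_demo", O_RDWR | O_CREAT, 0644);   /* path is illustrative */
        if (fd < 0) { perror("open"); return 1; }
        if (ftruncate(fd, 4096) != 0) { perror("ftruncate"); return 1; }

        char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        if (argc > 1 && strcmp(argv[1], "writer") == 0)
            strcpy(p, "hello from another process");      /* writer updates the shared page */
        else
            printf("reader sees: \"%s\"\n", p);           /* reader observes the update */

        munmap(p, 4096);
        close(fd);
        return 0;
    }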
Can I use the PTEs/SGL (scatter-gather list) of one process, which point to the fragments of physical memory that have been allocated, to create the corresponding PTEs in another process through the Linux kernel, and how do I do that?
Restate this too, it's not clear what you are asking.
I want to understand how mmap() works at a lower level.
I would recommend an operating systems book, in particular a chapter on virtual memory management; something like Operating System Concepts by Silberschatz et al.
http://www.amazon.co.uk/Operating-System-Concepts-Abraham-Silberschatz/dp/1118112733/ref=sr_1_5?ie=UTF8&qid=1386065707&sr=8-5&keywords=Operating+System+Concepts%2C+by+Silberschatz%2C+Galvin%2C+and+Gagne

shmat for attaching a shared memory segment

When I looked through the man pages of shmat, it is described as follows: the primitive function of the API is to attach the memory segment associated with shmid to the calling process's address space.
The questions I have are the following.
The term "attach" looks generic to me. I have difficulty understanding what underlying activity "attach" actually refers to.
What does mapping a segment of memory mean?
Use it as: char *ptr = shmat(seg_id, NULL, 0);
It attaches the segment identified by seg_id, which was created by shmget(), to the process that contains the above code.
seg_id is the segment id of the newly created segment.
NULL means the operating system will choose the starting address of the segment on the user's behalf.
0 is the flag for both read and write access.
Whenever a process attaches a shared memory segment, it should detach it afterwards so that another process can access it by attaching to that segment (if a resource-locking mechanism is present).
To detach: shmdt(ptr);
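A minimal end-to-end sketch of that sequence in one process (the key 0x1234 and the 4 KiB size are arbitrary values for illustration); "attaching" here is nothing more than mapping the segment into the process's address space and getting back an ordinary pointer:

    /* Sketch: create, attach, use, detach and remove a System V segment. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main(void) {
        int seg_id = shmget((key_t)0x1234, 4096, IPC_CREAT | 0666);  /* key/size arbitrary */
        if (seg_id < 0) { perror("shmget"); return 1; }

        char *ptr = shmat(seg_id, NULL, 0);          /* attach, read/write */
        if (ptr == (char *)-1) { perror("shmat"); return 1; }

        strcpy(ptr, "visible to any process that attaches this segment");
        printf("%s\n", ptr);

        shmdt(ptr);                                  /* detach from this process */
        shmctl(seg_id, IPC_RMID, NULL);              /* remove the segment when done */
        return 0;
    }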
There's a good explanation here: http://www.makelinux.net/alp/035
"Under Linux, each process's virtual memory is split into pages. Each process maintains a mapping from its memory addresses to these virtual memory pages, which contain the actual data. Even though each process has its own addresses, multiple processes' mappings can point to the same page, permitting sharing of memory"

Do shared libraries use the same heap as the application?

Say I have an application in Linux that uses shared libraries (.so files). My question is whether the code in those libraries will allocate memory in the same heap as the main application or do they use their own heap?
So for example, some function in the .so file calls malloc, would it use the same heap manager as the application or another one? Also, what about the global data in those shared libraries? Where does it lie? I know for the application it lies in the bss and data segments, but I don't know where it is for those shared object files.
My question is whether the code in those libraries will allocate memory in the same heap as the main application or do they use their own heap?
If the library uses the same malloc/free as the application (e.g. from glibc), then yes, the program and all its libraries will use a single heap.
If the library uses mmap directly, it can allocate memory that is not part of the heap used by the program itself.
So for example, some function in the .so file calls malloc, would it use the same heap manager as the application or another one?
If a function from a .so calls malloc, that malloc is the same malloc that is called from the program. You can see the symbol binding log on Linux/glibc (>2.1) with
LD_DEBUG=bindings ./your_program
Yes, several instances of heap managers (with the default configuration) can't co-exist without knowing about each other (the problem is keeping the brk-allocated heap size synchronized between the instances). But a configuration is possible in which several instances can co-exist.
Most classic malloc implementations (ptmalloc*, dlmalloc, etc.) can use two methods of getting memory from the system: brk and mmap. brk is the classic heap, which is linear and can grow or shrink. mmap allows getting lots of memory anywhere, and you can return this memory to the system (free it) in any order.
When malloc is built, the brk method can be disabled. Then malloc will emulate a linear heap using only mmaps, or even disable the classic linear heap entirely so that all allocations are made from discontiguous mmapped fragments.
So, a library can have its own memory manager, e.g. a malloc compiled with brk disabled, or a non-malloc memory manager. Such a manager should have function names other than malloc and free, for example malloc1 and free1, or should not export these names to the dynamic linker.
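As a sketch of the second case, a library could bypass the program's heap entirely by asking the kernel for pages itself; the names lib_alloc/lib_free below are hypothetical, not a real API:

    /* Sketch: library-private allocation that never touches the malloc heap. */
    #define _DEFAULT_SOURCE
    #include <stddef.h>
    #include <sys/mman.h>

    void *lib_alloc(size_t size) {               /* hypothetical helper */
        void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return p == MAP_FAILED ? NULL : p;       /* memory comes straight from mmap, not brk */
    }

    void lib_free(void *p, size_t size) {        /* hypothetical helper */
        if (p) munmap(p, size);                  /* returned directly to the kernel */
    }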
Also, what about the global data in those shared libraries? Where does it lie? I know for the application it lies in the bss and data segments, but I don't know where it is for those shared object files.
You should think of both the program and the .so simply as ELF files. Every ELF file has "program headers" (readelf -l elf_file). How data is loaded from the ELF file into memory depends on the program header's type. If the type is "LOAD", the corresponding part of the file will be privately mmapped (sic!) into memory. Usually there are 2 LOAD segments: the first one is for code, with R+X (read+execute) flags, and the second is for data, with R+W (read+write) flags. Both the .bss and .data (global data) sections are placed in the LOAD segment that has the write flag enabled.
Both the executable and the shared library have LOAD segments. Some segments have memory_size > file_size. That means the segment will be expanded in memory: the first part of it will be filled with data from the ELF file, and the remaining part of size (memory_size - file_size) will be filled with zeros (for the *bss sections), using mmap(/dev/zero) and memset(0).
When the kernel or the dynamic linker loads an ELF file into memory, they do not think about sharing. For example, say you start the same program twice. The first process will load the read-only part of the ELF file with mmap; the second process will do the same mmap (if ASLR is active, the second mmap will land at a different virtual address). It is the task of the page cache (VFS subsystem) to keep a single copy of the data in physical memory (with copy-on-write, aka COW); mmap just sets up mappings from a virtual address in each process to the single physical location. If any process changes a memory page, it will be copied on write to a unique private physical page.
The loading code is in glibc/elf/dl-load.c (_dl_map_object_from_fd) for ld.so and in linux-kernel/fs/binfmt_elf.c for the kernel's ELF loader (elf_map, load_elf_binary). Do a search for PT_LOAD.
So, global data and bss data are always privately mmapped in each process, and they are protected with COW.
The heap and stack are allocated at run time: the heap with brk+mmap, and the main thread's stack automagically by the OS kernel in a brk-like fashion. Additional threads' stacks are allocated with mmap in pthread_create.
Symbol tables are shared across an entire process in Linux. malloc() is the same for every part of the process. So yes, if all parts of a process access the heap via malloc() et al., they will share the same heap.
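A tiny sketch of that shared heap in action: strdup() allocates with malloc inside libc.so, and the main program frees the result with its own free(), which only works because both sides use the same heap manager:

    /* Sketch: memory allocated inside a shared library is freed by the program. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void) {
        char *copy = strdup("allocated by malloc inside libc.so");  /* library-side malloc */
        if (!copy) return 1;
        printf("%s, freed by the main program\n", copy);
        free(copy);                                                 /* same heap manager */
        return 0;
    }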
