mmap substitute for malloc - c

I need to find a way to use mmap instead of malloc. How is this possible? (I am not using libc, only syscalls.) And yes, brk() is possible. I used sbrk() but realized it's not a syscall... (x86 inline assembly)
I've been looking around and saw this: How to use mmap to allocate a memory in heap? But it didn't help me, because I got a segfault.
Basically, all I want to do is create 3 slabs of memory for storing characters.
Say,
char * x = malloc(1000);
char * y = malloc(2000);
char * z = malloc(3000);
How is this possible with mmap and how to free it later with munmap?

Did you carefully read the mmap(2) man page? I recommend reading it several times.
Notice that you can only ask the kernel (through mmap etc.) to manage memory that is page-aligned and a multiple of the page size sysconf(_SC_PAGE_SIZE), which is often 4096 bytes (and I am assuming that in my answer).
Then you might do:
size_t page_size = sysconf(_SC_PAGE_SIZE);
assert (page_size == 4096); // otherwise this code is wrong
// 1000 bytes fit into 1*4096
char *x = mmap (NULL, page_size, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, (off_t)0);
if (x == MAP_FAILED) perror("mmap x"), exit (EXIT_FAILURE);
// 2000 bytes fit into 1*4096
char *y = mmap (NULL, page_size, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, (off_t)0);
if (y == MAP_FAILED) perror("mmap y"), exit (EXIT_FAILURE);
later to free the memory, use
if (munmap(x, page_size))
perror("munmap x"), exit(EXIT_FAILURE);
etc
If you want to allocate 5Kbytes, you'll need two pages (because 5Kbytes < 2*4096 and 5Kbytes > 1*4096) i.e. mmap(NULL, 2*page_size, ...
Actually, all of your x, y, z take only 6000 bytes together and could fit into two, not three, pages... But then you could only munmap that memory together.
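The rounding described above can be sketched as a pair of small helpers. This is only an illustration under assumptions: the names round_up_to_page, page_alloc and page_free are made up for this example, the page size is assumed to be a power of two, and the code still goes through the libc mmap wrapper rather than a raw syscall.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round n up to the next multiple of page_size (must be a power of two). */
static size_t round_up_to_page(size_t n, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (n + page_size - 1) & ~(page_size - 1);
}

/* Hypothetical helper: map at least n bytes of zeroed, private memory.
 * Stores the real mapped length in *mapped_len (needed later for munmap).
 * Returns NULL on failure. */
static void *page_alloc(size_t n, size_t *mapped_len)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = round_up_to_page(n, page_size);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    *mapped_len = len;
    return p;
}

/* Hypothetical helper: release a region obtained from page_alloc. */
static int page_free(void *p, size_t mapped_len)
{
    return munmap(p, mapped_len);
}
```

With 4096-byte pages, page_alloc(1000, &len), page_alloc(2000, &len) and page_alloc(3000, &len) would each map one page, and each region can be released independently with page_free.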
Be aware that mmap is a system call, which can be quite expensive. malloc implementations take care to avoid calling it too often; that is why they keep track of previously freed zones and reuse them in later malloc calls without any syscall. In practice, most malloc implementations handle big allocations (e.g. more than a megabyte) differently: these are often mmap-ed at malloc time and munmap-ed at free time. You could study the source code of some malloc implementation. The one from musl libc might be easier to read than glibc's malloc.
BTW, the file /proc/1234/maps is showing you the memory map of process of pid 1234. Try also cat /proc/self/maps in a terminal, it shows the memory map of that cat process.

You can call mmap to make an anonymous mapping in x86 asm with something like:
mov eax, 192 ; mmap2 (syscall number 192 on 32-bit x86)
xor ebx, ebx ; addr = NULL
mov ecx, 4096 ; len = 4096
mov edx, 7 ; prot = PROT_READ|PROT_WRITE|PROT_EXEC
mov esi, 0x22 ; flags = MAP_PRIVATE|MAP_ANONYMOUS
mov edi, -1 ; fd = -1 (Ignored for MAP_ANONYMOUS)
xor ebp, ebp ; offset = 0 (4096*0) (Ignored for MAP_ANONYMOUS)
int 0x80 ; make call (There are other ways to do this too)
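For comparison, here is a rough sketch of the same idea from C on x86-64 Linux, where the calling convention differs from the 32-bit listing: the syscall instruction is used, the number goes in rax and the arguments in rdi, rsi, rdx, r10, r8, r9. The helper names raw_syscall6, raw_mmap and raw_munmap are invented for this example. On other architectures the sketch falls back to the libc syscall() wrapper, so only the x86-64 branch is truly libc-free.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>     /* only for the PROT_* and MAP_* constants */
#include <sys/syscall.h>  /* SYS_mmap, SYS_munmap */
#include <unistd.h>

/* Issue a 6-argument system call; returns the raw kernel result,
 * i.e. a value in [-4095, -1] means -errno. */
static long raw_syscall6(long nr, long a1, long a2, long a3,
                         long a4, long a5, long a6)
{
#if defined(__x86_64__)
    long ret;
    register long r10 __asm__("r10") = a4;
    register long r8  __asm__("r8")  = a5;
    register long r9  __asm__("r9")  = a6;
    __asm__ volatile("syscall"
                     : "=a"(ret)
                     : "a"(nr), "D"(a1), "S"(a2), "d"(a3),
                       "r"(r10), "r"(r8), "r"(r9)
                     : "rcx", "r11", "memory");
    return ret;
#else
    /* Fallback for other architectures: goes through libc. */
    long ret = syscall(nr, a1, a2, a3, a4, a5, a6);
    return ret == -1 ? -(long)errno : ret;
#endif
}

/* Map len bytes of private anonymous read/write memory; NULL on error. */
static void *raw_mmap(size_t len)
{
    assert(len != 0); /* the kernel rejects zero-length mappings */
    long r = raw_syscall6(SYS_mmap, 0, (long)len, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (r < 0 && r >= -4095) ? NULL : (void *)r;
}

/* Unmap a region obtained from raw_mmap; 0 on success, -errno on error. */
static int raw_munmap(void *p, size_t len)
{
    return (int)raw_syscall6(SYS_munmap, (long)p, (long)len, 0, 0, 0, 0);
}
```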

Related

mallinfo doesn't show mmap allocation's information

The mallinfo structure has two fields, hblks and hblkhd. The man page says they hold the number of blocks allocated by mmap and their total size in bytes. But when I run the following code
void * ptr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
*(int *) ptr = 10;
the fields hblks and hblkhd are still zero, while the total number of free bytes in the blocks decreases. Could you please explain why this behavior is observed?
I also tried to allocate all free space and call mmap after that, but in this situation the fields are also zero.
Compiler: gcc 9.4.0
OS: Ubuntu 20.04.1
I did some experiments and they led me to the conclusion that these fields are filled in only when the mmap happens inside a malloc call. A direct mmap call doesn't show up in this statistic, which is logical, because mmap is a system call and the statistic is collected in user space.
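A small experiment along these lines might look as follows. It is a sketch that assumes glibc with default tuning (so a 1 MiB request is above the mmap threshold and is served by mmap inside malloc); the helper names mmapped_chunks and hblks_tracks_only_malloc are invented here. On glibc 2.33 or later it uses mallinfo2, since mallinfo is deprecated there.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <malloc.h>
#include <stdlib.h>
#include <sys/mman.h>

#if defined(__GLIBC__)
# if __GLIBC_PREREQ(2, 33)
#  define USE_MALLINFO2 1
# endif
#endif

/* Number of mmap-backed chunks that malloc itself is tracking (hblks). */
static size_t mmapped_chunks(void)
{
#ifdef USE_MALLINFO2
    return mallinfo2().hblks;
#else
    return (size_t)mallinfo().hblks;
#endif
}

/* Returns 1 if a direct mmap leaves hblks unchanged while a large malloc
 * increments it, 0 if not, and -1 if an allocation failed. */
static int hblks_tracks_only_malloc(void)
{
    const size_t big = (size_t)1 << 20; /* 1 MiB, assumed above the threshold */
    assert(big > 128 * 1024);           /* glibc's default M_MMAP_THRESHOLD */

    size_t start = mmapped_chunks();

    void *direct = mmap(NULL, big, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (direct == MAP_FAILED)
        return -1;
    size_t after_mmap = mmapped_chunks();

    void *chunk = malloc(big);
    if (chunk == NULL) {
        munmap(direct, big);
        return -1;
    }
    ((volatile char *)chunk)[0] = 1;    /* keep the allocation observable */
    size_t after_malloc = mmapped_chunks();

    free(chunk);
    munmap(direct, big);
    return after_mmap == start && after_malloc == start + 1;
}
```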

Remapping stack succeeds, but later SEGV is raised

I ran a simple assembly program under strace; it simply executes SYS_exit.
_start:
mov rax, 0x3C
mov rdi, 0x0
syscall
And I noticed that there was nothing like an mmap of memory for the stack:
alrorp@dmspc:~$ strace ./bin
execve("./bin", ["./bin"], 0x7ffd591eda80 /* 65 vars */) = 0
exit(0) = ?
+++ exited with 0 +++
So I tried to do mmap with MAP_FIXED to the stack page-aligned address as follows:
int main(void){
int a = 1;
void *ptr = &a;
void *page_aligned_ptr = (void *)((intptr_t) ptr & -4096);
mmap(page_aligned_ptr, 4096, PROT_READ | PROT_WRITE, MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
}
The thing is it segfaults after the call to mmap succeeds (i.e. it returns the requested address instead of MAP_FAILED).
mmap(0x7ffdf50db000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7ffdf50db000
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault
Can you give any hint about this behavior? Core dump seems to be (almost) useless in that case with stack corrupted.
Does something like create a custom mapping for stack even make sense?
Replacing the stack page containing your return address with a new anonymous page of zero bytes obviously leads to a segfault as soon as main returns and pops 0 into RIP.
Note the si_addr=NULL, IIRC that's the code address where the fault happened. So RIP=0 after running a ret with RSP pointing at a 0. (The ret itself won't fault, but code-fetch from address 0 will.)
Or actually the segfault will be inside the libc wrapper for mmap, which itself has to ret.
Use a debugger to single-step the asm the C compiler created for you.
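The effect of MAP_FIXED on existing contents can be observed safely on an ordinary anonymous page instead of the stack. The following sketch (the function name remap_fixed_first_byte is made up for this example) writes a marker into a page, maps a fresh anonymous page over the same address, and reads the byte back; the old data is gone and the new page reads as zero.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns the first byte of a page after it has been replaced with a fresh
 * anonymous mapping via MAP_FIXED, or -1 if a mapping call failed. */
static int remap_fixed_first_byte(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    unsigned char *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    p[0] = 0xAB; /* stands in for a saved return address on the stack */

    /* Same address, new mapping: the old page and its contents are dropped. */
    unsigned char *q = mmap(p, page, PROT_READ | PROT_WRITE,
                            MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (q == MAP_FAILED) {
        munmap(p, page);
        return -1;
    }
    assert(q == p); /* MAP_FIXED returns exactly the requested address */

    int value = q[0];
    munmap(q, page);
    return value;
}
```

The same thing happens to the stack page in the question, except that the bytes being zeroed there include the return address.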

Predict malloc block sizes grid in C

I'm trying to optimize my dynamic memory usage. The thing is that I initially allocate some amount of memory for the data I get from a socket. Then, on the new data arrival I'm reallocating memory so the newly arrived part will fit into the local buffer. After some poking around I've found that malloc actually allocates a greater block than requested. In some cases significantly greater; here comes some debug info from malloc_usable_size(ptr):
requested 284 bytes, allocated 320 bytes
requested 644 bytes, reallocated 1024 bytes
It's well known that malloc/realloc are expensive operations. In most cases newly arrived data will fit into a previously allocated block (at least when I requested 644 bytes and got 1024 instead), but I have no idea how I can figure that out.
The trouble is that malloc_usable_size should not be relied upon (as described in the manual): if the program requested 644 bytes and malloc allocated 1024, the excess 380 bytes may be overwritten and cannot be used safely. So using malloc for a given amount of data and then calling malloc_usable_size to figure out how many bytes were really allocated isn't the way to go.
What I want is to know the block grid before calling malloc, so that I can request exactly the largest block size that covers what I need, store the allocated size, and on realloc check whether I really need to reallocate or whether the previously allocated block is already large enough.
In other words, if I were to request 644 bytes, and malloc actually gave me 1024, I want to have predicted that and requested 1024 instead.
Depending on your particular implementation of libc you will have different behaviour. I have found in most cases two approaches to do the trick:
Use the stack, this is not always feasible, but C allows VLAs on the stack and is the most effective if you don't intend to pass your buffer to an external thread
while (1) {
char buffer[known_buffer_size];
read(fd, buffer, known_buffer_size);
// use buffer
// released at the end of scope
}
In Linux you can make excellent use of mremap which can enlarge/shrink memory with zero-copy guaranteed. It may move your VM mapping though. Only problem here is that it only works in chunks of system page size sysconf(_SC_PAGESIZE) which is usually 0x1000.
void * buffer = mmap(NULL, init_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
while(1) {
// if needs remapping
{
// zero copy, but involves a system call
buffer = mremap(buffer, current_size, new_size, MREMAP_MAYMOVE);
}
// use buffer
}
munmap(buffer, current_size);
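A more complete sketch of the mremap approach, with the error checks the outline above leaves out, might look like this. The mapbuf struct and its functions are hypothetical names for this example; lengths are assumed to be multiples of the page size, and the code is Linux-specific.

```c
#define _GNU_SOURCE /* mremap() and MREMAP_MAYMOVE are Linux extensions */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical growable buffer backed directly by an anonymous mapping. */
struct mapbuf {
    void  *addr;
    size_t len;
};

/* Create the initial mapping; returns 0 on success, -1 on failure. */
static int mapbuf_init(struct mapbuf *b, size_t len)
{
    assert(b != NULL && len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    b->addr = p;
    b->len = len;
    return 0;
}

/* Grow or shrink the mapping. The address may change, but the existing
 * contents are preserved. On failure the old mapping stays valid. */
static int mapbuf_resize(struct mapbuf *b, size_t new_len)
{
    void *p = mremap(b->addr, b->len, new_len, MREMAP_MAYMOVE);
    if (p == MAP_FAILED)
        return -1;
    b->addr = p;
    b->len = new_len;
    return 0;
}

/* Release the mapping; returns the result of munmap. */
static int mapbuf_free(struct mapbuf *b)
{
    int rc = munmap(b->addr, b->len);
    b->addr = NULL;
    b->len = 0;
    return rc;
}
```

Because mremap may move the mapping, any pointers into the old buffer must be recomputed from b->addr after a successful resize.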
OS X has similar semantics to Linux's mremap through the Mach vm_remap; it's a little more complicated though.
void * buffer = mmap(NULL, init_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
mach_port_t this_task = mach_task_self();
while(1) {
// if needs remapping
{
// zero copy, but involves a system call
void * new_address = mmap(NULL, new_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
vm_prot_t cur_prot, max_prot;
munmap(new_address, current_size); // vm needs to be empty for remap
// there is a race condition between these two calls
vm_remap(this_task,
&new_address, // new address
current_size, // has to be page-aligned
0, // auto alignment
0, // remap fixed
this_task, // same task
buffer, // source address
0, // MAP READ-WRITE, NOT COPY
&cur_prot, // unused protection struct
&max_prot, // unused protection struct
VM_INHERIT_DEFAULT);
munmap(buffer, current_size); // remove old mapping
buffer = new_address;
}
// use buffer
}
The short answer is that the standard malloc interface does not provide the information you are looking for. To use the information breaks the abstraction provided.
Some alternatives are:
Rethink your usage model. Perhaps pre-allocate a pool of buffers at start, filling them as you go. Unfortunately this could complicate your program more than you would like.
Use a different memory allocation library that does provide the needed interface. Different libraries provide different tradeoffs in terms of fragmentation, max run time, average run time, etc.
Use your OS memory allocation API. These are often written to be efficient, but will generally require a system call (unlike a user-space library).
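One portable way to rethink the usage model is to stop guessing the allocator's block grid and track capacity yourself, growing geometrically so that most appends need no realloc at all. The sketch below is only an illustration: struct growbuf, growbuf_append and the starting capacity of 64 bytes are arbitrary choices for this example, not something any particular malloc prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical receive buffer that remembers how much it has reserved. */
struct growbuf {
    char  *data;
    size_t len; /* bytes in use */
    size_t cap; /* bytes requested from realloc */
};

/* Append n bytes, reallocating only when the reserved capacity is too
 * small. Capacity doubles each time. Returns 0 on success, -1 on failure. */
static int growbuf_append(struct growbuf *b, const void *src, size_t n)
{
    assert(b != NULL);
    if (n > SIZE_MAX - b->len)
        return -1; /* size overflow */
    size_t need = b->len + n;

    if (need > b->cap) {
        size_t new_cap = b->cap ? b->cap : 64;
        while (new_cap < need) {
            if (new_cap > SIZE_MAX / 2) {
                new_cap = need; /* cannot double any further */
                break;
            }
            new_cap *= 2;
        }
        char *p = realloc(b->data, new_cap);
        if (p == NULL)
            return -1; /* old buffer is still valid */
        b->data = p;
        b->cap = new_cap;
    }
    if (n != 0)
        memcpy(b->data + b->len, src, n);
    b->len = need;
    return 0;
}
```

Starting from a zero-initialized struct growbuf, appending 284 bytes reserves 512, a further 100 bytes fit without any realloc, and the buffer is released with free(b.data).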
In my professional code, I often take advantage of the actual size allocated by malloc()[etc], rather than the requested size. This is my function for determining the actual allocation size:
int MM_MEM_Stat(
void *I__ptr_A,
size_t *_O_allocationSize
)
{
int rCode = GAPI_SUCCESS;
size_t size = 0;
/*-----------------------------------------------------------------
** Validate caller arg(s).
*/
#ifdef __linux__ // Not required for __APPLE__, as alloc_size() will
// return 0 for non-malloc'ed refs.
if(NULL == I__ptr_A)
{
rCode=EINVAL;
goto CLEANUP;
}
#endif
/*-----------------------------------------------------------------
** Calculate the size.
*/
#if defined(__APPLE__)
size=malloc_size(I__ptr_A);
#elif defined(__linux__)
size=malloc_usable_size(I__ptr_A);
#else
#error "MM_MEM_Stat: unsupported platform"
#endif
if(0 == size)
{
rCode=EFAULT;
goto CLEANUP;
}
/*-----------------------------------------------------------------
** Return requested values to caller.
*/
if(_O_allocationSize)
*_O_allocationSize = size;
CLEANUP:
return(rCode);
}
I did some more research and found two interesting things about the malloc implementations in Linux and FreeBSD:
1) on Linux, malloc increases block sizes linearly in 16-byte steps, at least up to 8K, so no optimization is needed at all; it's just not worthwhile;
2) on FreeBSD the situation is different: the steps are bigger and tend to grow with the requested block size.
So any kind of optimization is needed only for FreeBSD, as Linux allocates blocks in very small steps and it's very unlikely to receive less than 16 bytes of data from a socket.
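To look at the step sizes on a particular system, one could probe the allocator with something like the sketch below. It assumes a libc that provides malloc_usable_size in malloc.h (glibc does); the helper name usable_size_for is made up for this example, and the values it returns are implementation details that vary between allocators and versions.

```c
#include <assert.h>
#include <malloc.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocate `request` bytes, report how many bytes the allocator says are
 * usable in that block, and free it again. Returns 0 if malloc failed. */
static size_t usable_size_for(size_t request)
{
    void *p = malloc(request);
    if (p == NULL)
        return 0;
    size_t usable = malloc_usable_size(p);
    assert(usable >= request); /* never smaller than what was asked for */
    free(p);
    return usable;
}
```

Calling it for a range of request sizes and printing each point where the result changes would show the grid for that allocator.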

Can memory allocated through mmap overlap the data segment

The malloc function uses both sbrk and mmap functions. Now the sbrk function increases or decreases the data segment. So it grows linearly. Now my question is, is this linearity always maintained, or for example, an mmap call can allocate memory overlapping the data segment?
I'm talking about multithreaded programs running on multicore systems. This blog talks about some serious flaws of sbrk for multithreaded programs, and it points out that it is possible that memory allocated with sbrk can be intermingled with memory allocated with mmap (The sbrk heap could become discontinuous because a mmaped region or a shared object obstructs the growth of the heap).
That blog post doesn't see the forest for the trees; only the malloc implementation is allowed to call sbrk with a nonzero argument. More precisely, most malloc implementations for Unix will stop functioning correctly (and by that I mean "your program will crash") if application code calls sbrk with a nonzero argument. If you want to make a large allocation directly from the OS you must use mmap to do it.
(It is true that in a multi-threaded program, malloc must internally wrap a mutex around its calls to sbrk, but that's an implementation detail. POSIX says malloc is thread safe, that's the important thing for an application programmer.)
mmap will not allocate memory overlapping the brk area unless you use MAP_FIXED. If you use MAP_FIXED and your program blows up you get to keep all the pieces.
The kernel tries to avoid doing it, but mmap in normal operation could conceivably allocate memory close to the top of the brk area. If this happens, a subsequent sbrk call that would collide with the mmap region will fail. It will not allocate discontiguous memory. Good implementations of malloc ought to detect this condition and start using mmap for everything. I have not actually tried it, but a test program would be pretty easy to write.
is this linearity always maintained, or for example, an mmap call can allocate memory overlapping the data segment?
Observed behavior is that the brk area is always linear. Implementation details: If enlarging the brk area is not possible, for example due to a blocking mapping, glibc will switch to mmap-only. Small allocations (<128KB) seem to be obtained by glibc via brk if possible, so blocking that with:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
int main(void)
{
int i;
for (i = 0; i < 1024; ++i) {
malloc(2048);
if (i == 512) {
void *r, *end = sbrk(0);
r = mmap(end, 4096, PROT_NONE,
MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
}
}
}
when straced, yields indeed
[...]
brk(0x1e7d000) = 0x1e7d000
mmap(0x1e7d000, 4096, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, 0, 0) = 0x1e7d000
brk(0x1e9e000) = 0x1e7d000 <-- (!)
mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbfd9bc9000

I need to recover my source code from the executable

It's the middle of the night, and I've accidentally overwritten all my work by typing
gcc source.c -o source.c
I still have the original binary and my only hope is to disassemble it, but I don't know how, or what the best tool is to get the most readable result. I know this probably isn't the right place to post but I'm stressing out. Can someone help me out please?
Thanks for uploading the file. As I suspected, it was unstripped so the function names remained. Besides standard boilerplate code I could identify functions main, register_broker, connect_exchange (unused and empty) and handle_requests.
I spent a bit of time in IDA Pro and it wasn't too hard to recover the main() function. First, here's the original, unmodified listing of main() from IDA: http://pastebin.com/sBxhRJMM
To proceed, you need to familiarize yourself with the AMD64 (System V) calling convention. To summarize, the first six integer or pointer arguments are passed in RDI(EDI), RSI(ESI), RDX(EDX), RCX(ECX), R8 and R9. The rest are passed on the stack, but all calls in main() use at most four arguments, so only the first four registers matter here.
IDA has helpfully labeled arguments of the standard C functions and even renamed some local variables. However, it can be improved and commented further. For example, since we're in main(), we know that argc (first argument) comes from EDI (since it's an int meaning 32-bit, it uses only the low half of RDI) and argv comes from RSI (it's a pointer so it uses the full 8 bytes of the register). So, we can rename the local variables into which EDI and RSI are copied:
mov [rbp+argc], edi
mov [rbp+argv], rsi
Next is a simple conditional block:
cmp [rbp+argc], 2
jz short loc_400EB3
mov rax, cs:stderr@@GLIBC_2_2_5
mov rdx, rax
mov eax, offset aUsage ; "Usage"
mov rcx, rdx ; s
mov edx, 5 ; n
mov esi, 1 ; size
mov rdi, rax ; ptr
call _fwrite
mov edi, 1 ; status
call _exit
Here we compare argc with 2, and if it is equal, we jump further in the code. If it is not equal, we call fwrite(). The first argument to it is in rdi, and rdi is loaded from rax, which holds the address of a constant string "Usage". The second argument is in esi and is 1, the third in edx and is 5, the fourth in rcx, which is loaded from rdx which has the value of stderr@@GLIBC_2_2_5, which is basically a fancy reference to the stderr variable from libc. Stringing it all up together, we get:
fwrite("Usage", 1, 5, stderr);
From my experience, I can say that most likely it is an inlined fprintf, since 5 is exactly the length of the string. I.e. the original code probably was:
fprintf(stderr, "Usage");
Next call is a simple exit(1);. Combining both with the comparison, we get:
if ( argc != 2 )
{
fprintf(stderr, "Usage");
exit(1);
}
Continuing in this vein, we can identify other calls and variables they use. It's somewhat tedious to describe it all, so I uploaded a commented version of the disassembly, where I tried to show the equivalent C code for each call. You can see it here: http://pastebin.com/p5sRSwgQ
From that commented version it's not very hard to imagine a possible version of main():
int main(int argc, char **argv)
{
if ( argc != 2 )
{
fprintf(stderr, "Usage");
exit(1);
}
char name[256];
gethostname(name, sizeof(name));
struct hostent* _hostent = gethostbyname(name);
struct in_addr *_addr0 = (struct in_addr *)(_hostent->h_addr_list[0]);
struct sockaddr_in addr;
addr.sin_family = AF_INET;
addr.sin_port = htons(0);
addr.sin_addr.s_addr = _addr0->s_addr;
char *tmp = (char *)malloc(6);
sprintf(tmp, "%d", addr.sin_port);
char *ip_str = inet_ntoa(*_addr0);
char *newbuf = (char *)malloc(strlen(argv[1]) + strlen(ip_str) + strlen(tmp) + 5);
strcpy(newbuf, "r");
strcat(newbuf, " ");
strcat(newbuf, argv[1]);
strcat(newbuf, " ");
strcat(newbuf, ip_str);
strcat(newbuf, " ");
strcat(newbuf, tmp);
register_broker(newbuf);
int fd = socket(PF_INET, SOCK_STREAM, 0);
if ( fd < 0 )
{
perror("Error creating socket");
exit(1);
}
if ( bind(fd, (struct sockaddr*)&addr, sizeof(addr)) != 0 )
{
perror("Error binding socket");
exit(1);
}
if ( listen(fd, 0x80) != 0 )
{
perror("Error listening on socket");
exit(1);
}
handle_requests(fd);
}
Recovering the other two functions is left as an exercise for the reader :)
There are several tools (you can search with Google) but I would suggest re-coding it. The time you will invest in refactoring what a disassembler returns is probably higher than the time needed to re-code.
I know it seems obvious but the correct answer would be: restore from a backup (that you should have)
There is unfortunately really no good way to go from the binary back to the source. You can try Boomerang, but I really don't expect good results.
Firstly, look for a backup source file. Most editors create files named .bak or filename.c~ with each file save. On a Windows machine, a forensic software tool might be able to retrieve the last source file(s). The tool I wrote, getfile used to be offered by NTI, but was acquired by Armor Holdings a few years ago—no idea if it is still available.
If the code is runnable, running it under the strace utility (a standard component of Linux distributions) can often help with some aspects of decoding the program, especially if it is I/O oriented. Alas, if the program is mostly internal data manipulation, this is not of much use. strace creates a log of the system calls made by the program and the parameters passed to them; it is an invaluable tool at times for understanding how a program behaves. For example, strace date produces (in part; I've omitted the runtime library startup):
clock_gettime(CLOCK_REALTIME, {1315760058, 681379835}) = 0
open("/etc/localtime", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb78b5000
read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\4\0\0\0\4\0\0\0\0"..., 4096) = 2819
_llseek(3, -24, [2795], SEEK_CUR) = 0
read(3, "\nPST8PDT,M3.2.0,M11.1.0\n", 4096) = 24
_llseek(3, 2818, [2818], SEEK_SET) = 0
close(3) = 0
munmap(0xb78b5000, 4096) = 0
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb78b5000
write(1, "Sun Sep 11 09:54:18 PDT 2011\n", 29Sun Sep 11 09:54:18 PDT 2011) = 29
close(1) = 0
munmap(0xb78b5000, 4096) = 0
close(2) = 0
As soon as you have anything worth saving:
Add some sort of source control (git, svn, cvs, ...) maybe more than one
Use an automated build tool, like make to avoid silly mistakes
Make backups once in a while. Even when I am at a stone-knives-and-bear-skins client, I can still email myself source files for a last-ditch backup mechanism.
You can use dcc. But, next time, you should use Git ;)
You can try disassembling with objdump -d <filename>.
You can also look at the symbol names with the nm utility to jog your memory and help recode the source.
The commercial IDA Pro disassembler/debugger is popular in software reverse engineering. Unfortunately, reverse engineering a binary is slow and difficult work.

Resources