mallinfo doesn't show mmap allocation's information


In the mallinfo structure there are two fields, hblks and hblkhd. The man page says that they hold the number of blocks allocated by mmap and their total size in bytes. But when I run the following code
void * ptr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
*(int *) ptr = 10;
the fields hblks and hblkhd are still zero, while the total number of free bytes in the blocks decreases. Could you please explain why this behavior is observed?
I also tried allocating all the free space and calling mmap afterwards, but the fields were still zero in that case too.
Compiler: gcc 9.4.0
OS: Ubuntu 20.04.1

I did some experiments, and they led me to the conclusion that these fields are filled in only when the mmap happens inside a malloc call. A direct mmap call does not show up in these statistics, which is logical: mmap is a system call, while the statistics are collected in user space by the allocator.
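A minimal sketch of that experiment, assuming glibc on Linux (the helper names are made up for illustration, and mallinfo2 is used only where glibc 2.33 or later provides it). It compares how hblks changes while a large malloc block is alive with how it changes while a direct mmap mapping is alive:

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std= modes */
#include <malloc.h>
#include <stdlib.h>
#include <sys/mman.h>

#if defined(__GLIBC_PREREQ)
# if __GLIBC_PREREQ(2, 33)
#  define HAVE_MALLINFO2 1       /* mallinfo() is deprecated from glibc 2.33 on */
# endif
#endif

/* Number of mmap'ed chunks that malloc itself is tracking (glibc-specific). */
static long mmapped_chunks(void)
{
#ifdef HAVE_MALLINFO2
    return (long)mallinfo2().hblks;
#else
    return (long)mallinfo().hblks;
#endif
}

/* Change in hblks while one malloc(size) block is alive; -1 if malloc fails. */
static long hblks_delta_malloc(size_t size)
{
    /* Pin the threshold so the result does not depend on glibc's dynamic tuning. */
    mallopt(M_MMAP_THRESHOLD, 128 * 1024);
    long before = mmapped_chunks();
    void *p = malloc(size);
    if (p == NULL)
        return -1;
    *(volatile char *)p = 1;     /* touch it so the allocation is not optimized away */
    long delta = mmapped_chunks() - before;
    free(p);
    return delta;
}

/* Change in hblks while one direct mmap(size) mapping is alive; -1 if mmap fails. */
static long hblks_delta_mmap(size_t size)
{
    long before = mmapped_chunks();
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    *(volatile char *)p = 1;
    long delta = mmapped_chunks() - before;
    munmap(p, size);
    return delta;
}
```

With a request well above the mmap threshold (for example 1 MiB), the first helper should report one extra chunk and the second should report none, because the allocator never sees the direct system call.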

Related

Using mmap() instead of malloc()

I am trying to complete an exercise that is done with system calls and need to allocate memory for a struct *. My code is:
myStruct * entry = (myStruct *)mmap(0, SIZEOF(myStruct), PROT_READ|PROT_WRITE,
MAP_ANONYMOUS, -1, 0);
To clarify, I cannot use malloc() but can use mmap(). I had no issues with this on Windows in NetBeans; however, now that I'm compiling and running from the command line on Ubuntu, I get a "Segmentation Fault" each time I try to access it.
Is there a reason why it will work on one and not the other, and is mmap() a valid way of allocating memory in this fashion? My worry was I was going to be allocating big chunks of memory for each mmap() call initially, now I just cannot get it to run.
Additionally, the error returned by mmap is 22 - Invalid Argument (I did some troubleshooting while writing the question, so the error check isn't in the above code). The address is 0, the custom SIZEOF() function works in other mmap arguments, and I am using MAP_ANONYMOUS, so the fd and offset parameters must be -1 and 0 respectively.
Is there something wrong with the PROT_READ|PROT_WRITE sections?

You need to specify MAP_PRIVATE in your flags.
myStruct * entry = (myStruct *)mmap(0, SIZEOF(myStruct),
PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
From the manual page:
The flags argument determines whether updates to the mapping are
visible to other processes mapping the same region, and whether
updates are carried through to the underlying file. This behavior is
determined by including exactly one of the following values in flags:
You need exactly one of the flags MAP_PRIVATE or MAP_SHARED - but you didn't give either of them.
A complete example:
#include <sys/mman.h>
#include <stdio.h>
typedef struct
{
int a;
int b;
} myStruct;
int main()
{
myStruct * entry = (myStruct *)mmap(0, sizeof(myStruct),
PROT_READ|PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (entry == MAP_FAILED) {
printf("Map failed.\n");
}
else {
entry->a = 4;
printf("Success: entry=%p, entry->a = %d\n", entry, entry->a);
}
return 0;
}
(The above, without MAP_PRIVATE of course, is a good example of what you might have provided as an MCVE. This makes it much easier for others to help you, since they can see exactly what you've done and test their proposed solutions. You should always provide an MCVE.)

The man page for mmap() says that you must specify exactly one of MAP_SHARED and MAP_PRIVATE in the flags argument. In your case, to act like malloc(), you'll want MAP_PRIVATE:
myStruct *entry = mmap(0, sizeof *entry,
PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
(I've also made this more idiomatic C by omitting the harmful cast and matching the sizeof to the actual variable rather than its type).
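To see the error from the question directly, here is a small sketch, assuming Linux (try_anonymous_map is a hypothetical helper, not part of any library). It attempts one anonymous mapping with the given flags and returns the errno that mmap sets on failure:

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std= modes */
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Try one anonymous mapping with the given flags.
 * Returns 0 on success, or the errno value mmap reported on failure. */
static int try_anonymous_map(int flags, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, -1, 0);
    if (p == MAP_FAILED)
        return errno;            /* check MAP_FAILED, not NULL */
    munmap(p, size);
    return 0;
}
```

On Linux, passing MAP_ANONYMOUS alone should yield EINVAL (22), which matches the error in the question, while MAP_PRIVATE | MAP_ANONYMOUS should succeed.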

Is there a limit on memory allocated using huge pages?

I am allocating memory with mmap using huge pages (1 MB size). After allocating 4 GB of memory, mmap fails.
mmap(NULL, memsize, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE | MAP_HUGETLB, -1, 0);
Here memsize = 1 GB.
I call the above statement in a loop. It works for the first 4 iterations; in the 5th iteration mmap fails.
mmap(NULL, memsize, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
The above statement (without huge pages) works perfectly for any number of iterations. Am I missing any information related to huge pages?
I also tried the MAP_NORESERVE flag, as mentioned in "mmap fail after 4GB".
Any sort of information will be greatly appreciated. Thank you.

Change the number of allocated huge pages in the file
/proc/sys/vm/nr_hugepages
according to the amount of memory you want to allocate.
Before the change it reported:
> cat /proc/meminfo | grep HugePages
HugePages_Total = 2500
4 GB corresponds to 2048 huge pages of 2 MB each (2048 * 2 MB = 4 GB), so 2048 huge pages were already consumed.
One more GB of memory needs 1 GB / 2 MB = 512 more huge pages, but only 2500 - 2048 = 452 were left. That is why mmap failed. If you change the contents of /proc/sys/vm/nr_hugepages to 2560, it allows 5 GB. Change it according to the amount of memory you need. Thanks to @Klas Lindbäck: I went back to the link, and a little research revealed how this works.
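The arithmetic above can be sketched as a small helper (huge_pages_needed is a hypothetical name; the 2 MB page size is taken from this answer and should be checked against the Hugepagesize line in /proc/meminfo on the actual machine):

```c
#include <stdint.h>

/* Number of huge pages needed to back `bytes` of memory, rounding up.
 * Returns 0 if huge_page_bytes is 0, to avoid dividing by zero. */
static uint64_t huge_pages_needed(uint64_t bytes, uint64_t huge_page_bytes)
{
    if (huge_page_bytes == 0)
        return 0;
    return bytes / huge_page_bytes + (bytes % huge_page_bytes != 0);
}
```

For 5 GB with 2 MB pages this gives 2560, the value suggested for nr_hugepages, and for a single 1 GB mapping it gives 512.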

Predict malloc block sizes grid in C

I'm trying to optimize my dynamic memory usage. The thing is that I initially allocate some amount of memory for the data I get from a socket. Then, on the new data arrival I'm reallocating memory so the newly arrived part will fit into the local buffer. After some poking around I've found that malloc actually allocates a greater block than requested. In some cases significantly greater; here comes some debug info from malloc_usable_size(ptr):
requested 284 bytes, allocated 320 bytes
requested 644 bytes, reallocated 1024 bytes
It's well known that malloc/realloc are expensive operations. In most cases newly arrived data will fit into a previously allocated block (at least when I requested 644 bytes and got 1024 instead), but I have no idea how I can figure that out.
The trouble is that malloc_usable_size should not be relied upon (as described in the manual): if the program requested 644 bytes and malloc allocated 1024, the excess 380 bytes may be overwritten and cannot be used safely. So calling malloc for a given amount of data and then using malloc_usable_size to figure out how many bytes were really allocated isn't the way to go.
What I want is to know the block grid before calling malloc, so that I can request exactly the largest block size that covers what I need, store the allocated size, and on realloc check whether I really need to reallocate or whether the previously allocated block is already big enough.
In other words, if I were to request 644 bytes, and malloc actually gave me 1024, I want to have predicted that and requested 1024 instead.

Depending on your particular implementation of libc you will have different behaviour. I have found in most cases two approaches to do the trick:
Use the stack. This is not always feasible, but C allows VLAs on the stack, and it is the most effective approach if you don't intend to pass your buffer to another thread.
while (1) {
char buffer[known_buffer_size];
read(fd, buffer, known_buffer_size);
// use buffer
// released at the end of scope
}
On Linux you can make excellent use of mremap, which can enlarge or shrink a mapping without copying the data. It may move your VM mapping, though. The only problem is that it works only in multiples of the system page size, sysconf(_SC_PAGESIZE), which is usually 0x1000.
void * buffer = mmap(NULL, init_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
while(1) {
// if needs remapping
{
// zero copy, but involves a system call; mremap takes the old size as well as the new one
buffer = mremap(buffer, current_size, new_size, MREMAP_MAYMOVE);
current_size = new_size;
}
// use buffer
}
munmap(buffer, current_size);
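A self-contained sketch of the same idea, assuming Linux with glibc (map_anonymous and resize_mapping are hypothetical wrappers). Note that mremap needs both the old and the new size, and that both should be multiples of the page size:

```c
#define _GNU_SOURCE              /* mremap() and MREMAP_MAYMOVE are Linux extensions */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Create a private anonymous read/write mapping; returns NULL on failure. */
static void *map_anonymous(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Grow or shrink a mapping, letting the kernel move it if necessary.
 * Returns the (possibly new) address, or NULL on failure, in which case
 * the old mapping is left untouched. */
static void *resize_mapping(void *old, size_t old_size, size_t new_size)
{
    void *p = mremap(old, old_size, new_size, MREMAP_MAYMOVE);
    return p == MAP_FAILED ? NULL : p;
}
```

After a successful resize the old contents are still there at the returned address, and any newly added pages of an anonymous mapping read as zero. The caller has to keep track of the current size itself, since munmap needs it later.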
OS X has similar semantics to Linux's mremap through the Mach vm_remap call; it's a little more complicated, though.
void * buffer = mmap(NULL, init_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
mach_port_t this_task = mach_task_self();
while(1) {
// if needs remapping
{
// zero copy, but involves a system call
void * new_address = mmap(NULL, new_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
vm_prot_t cur_prot, max_prot;
munmap(new_address, current_size); // vm needs to be empty for remap
// there is a race condition between these two calls
vm_remap(this_task,
&new_address, // new address
current_size, // has to be page-aligned
0, // auto alignment
0, // remap fixed
this_task, // same task
buffer, // source address
0, // MAP READ-WRITE, NOT COPY
&cur_prot, // unused protection struct
&max_prot, // unused protection struct
VM_INHERIT_DEFAULT);
munmap(buffer, current_size); // remove old mapping
buffer = new_address;
}
// use buffer
}

The short answer is that the standard malloc interface does not provide the information you are looking for. Relying on that information would break the abstraction malloc provides.
Some alternatives are:
Rethink your usage model. Perhaps pre-allocate a pool of buffers at start, filling them as you go. Unfortunately this could complicate your program more than you would like.
Use a different memory allocation library that does provide the needed interface. Different libraries provide different tradeoffs in terms of fragmentation, max run time, average run time, etc.
Use your OS memory allocation API. These are often written to be efficient, but will generally require a system call (unlike a user-space library).
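One way to rethink the usage model without depending on allocator internals is to track the capacity yourself and grow it geometrically, so that realloc is called only when the data no longer fits. This is a minimal sketch, not the asker's code; the growbuf type, its functions, and the initial capacity of 64 bytes are all assumptions chosen for illustration:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* A byte buffer that remembers how much it asked malloc for. */
typedef struct {
    char  *data;
    size_t len;   /* bytes in use */
    size_t cap;   /* bytes requested from the allocator */
} growbuf;

/* Append n bytes, reallocating only when len + n exceeds the tracked capacity.
 * Returns 0 on success, -1 on overflow or allocation failure. */
static int growbuf_append(growbuf *b, const void *src, size_t n)
{
    if (n > SIZE_MAX - b->len)
        return -1;
    size_t need = b->len + n;
    if (need > b->cap) {
        size_t new_cap = b->cap ? b->cap : 64;
        while (new_cap < need) {
            if (new_cap > SIZE_MAX / 2) {   /* doubling would overflow */
                new_cap = need;
                break;
            }
            new_cap *= 2;
        }
        char *p = realloc(b->data, new_cap);
        if (p == NULL)
            return -1;                      /* old buffer is still valid */
        b->data = p;
        b->cap = new_cap;
    }
    if (n != 0)
        memcpy(b->data + b->len, src, n);
    b->len = need;
    return 0;
}

/* Release the buffer and reset it to the empty state. */
static void growbuf_free(growbuf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}
```

A zero-initialized growbuf is a valid empty buffer. Because the capacity doubles, the number of realloc calls grows only logarithmically with the amount of data received, whatever block grid the underlying malloc happens to use.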

In my professional code, I often take advantage of the actual size allocated by malloc() (etc.), rather than the requested size. This is my function for determining the actual allocation size:
int MM_MEM_Stat(
void *I__ptr_A,
size_t *_O_allocationSize
)
{
int rCode = GAPI_SUCCESS;
size_t size = 0;
/*-----------------------------------------------------------------
** Validate caller arg(s).
*/
#ifdef __linux__ // Not required for __APPLE__, as malloc_size() will
// return 0 for non-malloc'ed refs.
if(NULL == I__ptr_A)
{
rCode=EINVAL;
goto CLEANUP;
}
#endif
/*-----------------------------------------------------------------
** Calculate the size.
*/
#if defined(__APPLE__)
size=malloc_size(I__ptr_A);
#elif defined(__linux__)
size=malloc_usable_size(I__ptr_A);
#else
#error "MM_MEM_Stat: unsupported platform"
#endif
if(0 == size)
{
rCode=EFAULT;
goto CLEANUP;
}
/*-----------------------------------------------------------------
** Return requested values to caller.
*/
if(_O_allocationSize)
*_O_allocationSize = size;
CLEANUP:
return(rCode);
}

I did some more research and found two interesting things about the malloc implementations in Linux and FreeBSD:
1) on Linux, malloc block sizes grow linearly in 16-byte steps, at least up to 8K, so no optimization is needed at all; it's just not worth it;
2) on FreeBSD the situation is different: the steps are bigger and tend to grow with the requested block size.
So any kind of optimization is needed only for FreeBSD, as Linux allocates blocks in very small steps and it's very unlikely to receive less than 16 bytes of data from a socket.
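For anyone who wants to repeat this measurement on their own system, here is a rough sketch, assuming glibc (malloc_usable_size lives in malloc.h there; FreeBSD declares it in malloc_np.h instead). The helper names are made up, and the step sizes it prints depend on the allocator and its version, so no particular output is implied:

```c
#include <malloc.h>
#include <stdio.h>
#include <stdlib.h>

/* Usable size the allocator reports for one request, or 0 if malloc failed. */
static size_t usable_size_for(size_t request)
{
    void *p = malloc(request);
    if (p == NULL)
        return 0;
    size_t usable = malloc_usable_size(p);
    free(p);
    return usable;
}

/* Print every request size in [from, to] at which the usable size changes,
 * which makes the allocator's size grid visible. */
static void print_size_grid(size_t from, size_t to)
{
    size_t previous = 0;
    for (size_t request = from; request <= to; ++request) {
        size_t usable = usable_size_for(request);
        if (usable != previous) {
            printf("request %zu -> usable %zu\n", request, usable);
            previous = usable;
        }
    }
}
```

Calling print_size_grid(1, 8192) is enough to see whether the steps are uniform or widen with the request size. This is a diagnostic only; as noted in the question, production code should not depend on the reported values.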

Can memory allocated through mmap overlap the data segment

The malloc function uses both sbrk and mmap functions. Now the sbrk function increases or decreases the data segment. So it grows linearly. Now my question is, is this linearity always maintained, or for example, an mmap call can allocate memory overlapping the data segment?
I'm talking about multithreaded programs running on multicore systems. This blog talks about some serious flaws of sbrk for multithreaded programs, and it points out that memory allocated with sbrk can become intermingled with memory allocated with mmap (the sbrk heap could become discontinuous because an mmapped region or a shared object obstructs the growth of the heap).

That blog post doesn't see the forest for the trees; only the malloc implementation is allowed to call sbrk with a nonzero argument. More precisely, most malloc implementations for Unix will stop functioning correctly (and by that I mean "your program will crash") if application code calls sbrk with a nonzero argument. If you want to make a large allocation directly from the OS you must use mmap to do it.
(It is true that in a multi-threaded program, malloc must internally wrap a mutex around its calls to sbrk, but that's an implementation detail. POSIX says malloc is thread safe, that's the important thing for an application programmer.)
mmap will not allocate memory overlapping the brk area unless you use MAP_FIXED. If you use MAP_FIXED and your program blows up you get to keep all the pieces.
The kernel tries to avoid doing it, but mmap in normal operation could conceivably allocate memory close to the top of the brk area. If this happens, a subsequent sbrk call that would collide with the mmap region will fail. It will not allocate discontiguous memory. Good implementations of malloc ought to detect this condition and start using mmap for everything. I have not actually tried it, but a test program would be pretty easy to write.

is this linearity always maintained, or for example, an mmap call can allocate memory overlapping the data segment?
Observed behavior is that the brk area is always linear. Implementation details: If enlarging the brk area is not possible, for example due to a blocking mapping, glibc will switch to mmap-only. Small allocations (<128KB) seem to be obtained by glibc via brk if possible, so blocking that with:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
int main(void)
{
int i;
for (i = 0; i < 1024; ++i) {
malloc(2048);
if (i == 512) {
void *r, *end = sbrk(0);
r = mmap(end, 4096, PROT_NONE,
MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
}
}
}
when straced, yields indeed
[...]
brk(0x1e7d000) = 0x1e7d000
mmap(0x1e7d000, 4096, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, 0, 0) = 0x1e7d000
brk(0x1e9e000) = 0x1e7d000 <-- (!)
mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbfd9bc9000

Create/initialize objects in shared memory (opened by mmap())

I created my shared memory and mapped my object with following code:
shmfd = shm_open(SHMOBJ_PATH, O_CREAT | O_EXCL | O_RDWR, S_IRWXU | S_IRWXG);
ftruncate(shmfd, shared_seg_size);
bbuffer = (boundedBuffer *)mmap(NULL, shared_seg_size, PROT_READ | PROT_WRITE, MAP_SHARED, shmfd, 0);
Now I need to initialize and add/remove items to/from bbuffer. When I try to add/remove, I get Segmentation Fault: 11, which indicates the program accessed a memory location that was not assigned. What can I do to solve this problem?

A wild guess:
perhaps you don't have the header file for mmap included
and you are on an architecture with 64 bit void* and 32 bit int
What could happen there is that the compiler assumes mmap returns int by default; the cast then converts that int into a pointer, and the higher-order bits of the address are lost.
Never cast return values from functions such as malloc or mmap, and always take the warnings of your compiler seriously.
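A hedged sketch of what a checked version of the setup could look like, with every required header included and every return value tested. The bounded_buffer layout and the create_shared_buffer helper are hypothetical, since the question does not show the real type; on glibc older than 2.34, shm_open also needs linking with -lrt:

```c
#define _DEFAULT_SOURCE          /* exposes shm_open/ftruncate under strict -std= modes */
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical layout; the real boundedBuffer is not shown in the question. */
typedef struct {
    size_t head, tail, count;
    int    items[16];
} bounded_buffer;

/* Create a new shared memory object, size it, map it, and zero it.
 * Returns the mapped buffer, or NULL if any step fails. */
static bounded_buffer *create_shared_buffer(const char *name)
{
    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return NULL;
    if (ftruncate(fd, (off_t)sizeof(bounded_buffer)) == -1) {
        close(fd);
        shm_unlink(name);
        return NULL;
    }
    /* No cast: with <sys/mman.h> included, mmap is known to return void *. */
    void *p = mmap(NULL, sizeof(bounded_buffer), PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);                   /* the mapping stays valid after close */
    if (p == MAP_FAILED) {
        shm_unlink(name);
        return NULL;
    }
    memset(p, 0, sizeof(bounded_buffer));
    return p;
}
```

If any of the three calls fails, the function returns NULL instead of handing back an invalid pointer, so a failed shm_open or mmap shows up as an error rather than as a segmentation fault on first access.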
