Calculating a Process's Memory Usage - c

I have a pointer to a running process. I want to know how much of the total physical memory that process is taking up.
I tried this, but I am getting 0 as the return value.
unsigned long mem_usage(struct task_struct *process)
{
return process->mm->total_vm/2048 * 100000; // this is wrong vm means virtual memory.
}
process->mm->total_vm returns bytes right? Is there an easier way to calculate this?

According to mm_types.h
unsigned long total_vm; /* Total pages mapped */
is the size in pages.
This means, if you want the size in bytes, you have to convert the pages to bytes
total_vm << PAGE_SHIFT
Update:
The reverse way, converting bytes to pages, is
pages = bytes >> PAGE_SHIFT;
But this works only for whole pages. If bytes is some number of pages plus some remaining bytes, you must either increment the resulting number of pages or round up before shifting:
pages = (bytes + PAGE_SIZE - 1) >> PAGE_SHIFT;
For just 2 GiB this would be
pages_2gb = (2UL * 1024 * 1024 * 1024) >> PAGE_SHIFT; /* UL avoids signed int overflow */
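The two conversions above can be tried outside the kernel with a small user-space sketch. DEMO_PAGE_SHIFT, DEMO_PAGE_SIZE and the function names are made up for illustration, and a 4 KiB page is assumed; kernel code would use the real PAGE_SHIFT and PAGE_SIZE macros instead.

```c
#include <assert.h>
#include <limits.h>

/* Assumed for illustration: 4 KiB pages. Not the kernel's own macros. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)

/* Pages to bytes: multiply by the page size. */
unsigned long demo_pages_to_bytes(unsigned long pages)
{
    assert(pages <= (ULONG_MAX >> DEMO_PAGE_SHIFT)); /* result must fit */
    return pages << DEMO_PAGE_SHIFT;
}

/* Bytes to pages, counting a partially used page as a whole page. */
unsigned long demo_bytes_to_pages(unsigned long bytes)
{
    assert(bytes <= ULONG_MAX - (DEMO_PAGE_SIZE - 1)); /* avoid wrap-around */
    return (bytes + DEMO_PAGE_SIZE - 1) >> DEMO_PAGE_SHIFT;
}
```

With these assumptions, 4096 bytes is one page and 4097 bytes needs two.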

Related

Offset for mmap() must be page aligned [duplicate]

I came across the following algorithm that aligns a virtual address to the immediate next page boundary.
VirtualAddr = (VirtualAddr & ~(PageSize-1));
Also, given a length in bytes, it aligns the length (rounds it) to a page boundary:
len = ((PageSize-1)&len) ? ((len+PageSize) & ~(PageSize-1)):len;
I am finding it hard to decipher how this works.
Can someone help me out to break it down?
Those calculations assume that the page size is a power of 2 (which is the case for
all systems that I know of), for example
PageSize = 4096 = 2^12 = 1000000000000 (binary)
Then (written as binary numbers)
PageSize-1 = 00...00111111111111
~(PageSize-1) = 11...11000000000000
which means that
(VirtualAddr & ~(PageSize-1))
is VirtualAddr with the lower 12 bits set to zero or, in other words,
VirtualAddr rounded down to the next multiple of 2^12 = PageSize.
Now you can (hopefully) see that in
len = ((PageSize-1)&len) ? ((len+PageSize) & ~(PageSize-1)):len;
the first expression
((PageSize-1)&len)
is zero exactly if len is a multiple of PageSize. In that case, len is left
unchanged. Otherwise (len + PageSize) is rounded down to the next multiple of
PageSize.
So in any case, len is rounded up to the next multiple of PageSize.
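The rounding described above can be checked with a short sketch. The function names are hypothetical; round_up_ternary() mirrors the expression from the question, and round_up_to_page() is a common branch-free variant that should give the same result for any power-of-two page size.

```c
#include <assert.h>
#include <stddef.h>

/* True if n is a non-zero power of two. */
static int is_power_of_two(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Round len down to a multiple of page_size by clearing the low bits. */
size_t round_down_to_page(size_t len, size_t page_size)
{
    assert(is_power_of_two(page_size));
    return len & ~(page_size - 1);
}

/* Round len up to a multiple of page_size (ternary form from the question). */
size_t round_up_ternary(size_t len, size_t page_size)
{
    assert(is_power_of_two(page_size));
    return ((page_size - 1) & len) ? ((len + page_size) & ~(page_size - 1)) : len;
}

/* Same result without a branch: add page_size - 1, then clear the low bits. */
size_t round_up_to_page(size_t len, size_t page_size)
{
    assert(is_power_of_two(page_size));
    return (len + page_size - 1) & ~(page_size - 1);
}
```

For a page size of 4096, a length of 5000 rounds down to 4096 and up to 8192, while 4096 itself is left unchanged by all three functions.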
I think the first one should be
VirtualAddr = (VirtualAddr & ~(PageSize-1)) + PageSize;
This one-liner will do it. If the address is already aligned, it will not skip to the next page boundary:
aligned = ((unsigned long) a & (getpagesize()-1)) ? (void *) (((unsigned long) a+getpagesize()) & ~(getpagesize()-1)) : a;
If you really do want to skip to the next page boundary even if it's already aligned, just do:
aligned = (void *) (((unsigned long) a+getpagesize()) & ~(getpagesize()-1))
This should avoid all compiler warnings, too.
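getpagesize() is declared in <unistd.h>; #include it to avoid warnings. Note that it is a legacy interface that was dropped from POSIX.1-2001; sysconf(_SC_PAGESIZE) is the current POSIX way to query the page size.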
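A variant of the one-liners above that uses uintptr_t, so the pointer-to-integer cast is wide enough on every platform that provides that type. This is only a sketch: the function names are hypothetical, and the 4096 fallback in system_page_size() is an assumption for the case where sysconf() fails.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Round p up to a page boundary; an aligned pointer is returned unchanged. */
void *align_ptr_up(void *p, uintptr_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    uintptr_t addr = (uintptr_t)p;
    return (void *)((addr + page_size - 1) & ~(page_size - 1));
}

/* Always move to the following page boundary, even if p is already aligned. */
void *skip_to_next_page(void *p, uintptr_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    uintptr_t addr = (uintptr_t)p;
    return (void *)((addr + page_size) & ~(page_size - 1));
}

/* Query the page size at run time. */
uintptr_t system_page_size(void)
{
    long n = sysconf(_SC_PAGESIZE);
    return n > 0 ? (uintptr_t)n : 4096; /* fallback value is an assumption */
}
```

A call such as align_ptr_up(buf, system_page_size()) then replaces the repeated getpagesize() calls in the one-liner.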

Redis source code, (size&(sizeof(long)-1)) in the zmalloc.c

I am learning the Redis source code, and in zmalloc.c there is:
size_t zmalloc_size(void *ptr) {
void *realptr = (char*)ptr-PREFIX_SIZE;
size_t size = *((size_t*)realptr);
/* Assume at least that all the allocations are padded at sizeof(long) by
* the underlying allocator. */
if (size&(sizeof(long)-1)) size += sizeof(long)-(size&(sizeof(long)-1));
return size+PREFIX_SIZE;
}
I am confused by
if (size&(sizeof(long)-1)) size += sizeof(long)-(size&(sizeof(long)-1));
What is the effect of it? Memory padding? Then why sizeof(long)?
Yes, it seems to be to include the memory padding with the assumption that all allocations are padded at the sizeof(long) (as said by the comment).
Pseudo-code example:
size = 6 // as an example
sizeof(long) == 4
size & (sizeof(long) - 1) == 6 & (4 - 1) == 6 & 3 == 2
size += 4 - 2
size == 8 // two bytes of padding included
I'm pretty fresh in C though so you should probably not take my word for it. I'm not sure why one can assume that the underlying allocator will align at the size of long, perhaps it's only a decent approximation that is sufficient for zmalloc_size's use-case.
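The pseudo-code above can be turned into a small function to experiment with. pad_to_alignment() is a hypothetical name, not something from zmalloc.c; it applies the same mask-and-add step for any power-of-two alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to a multiple of align, which must be a power of two. */
size_t pad_to_alignment(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t remainder = size & (align - 1);  /* same as size % align */
    if (remainder)
        size += align - remainder;          /* add the missing padding bytes */
    return size;
}
```

Calling pad_to_alignment(6, 4) reproduces the walk-through: the remainder is 2, so two bytes of padding are added and the result is 8.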

Determine cache miss rate for a code snippet

I am preparing for an upcoming exam and I was having trouble with this problem:
direct mapped cache of size 64K with block size 16 bytes. Cache starts empty
What is the cache miss rate if...
ROWS = 128, COLS = 128
ROWS = 128 and COLS = 192
ROWS = 128 and COLS = 256
[solution: page 5 http://www.inf.ethz.ch/personal/markusp/teaching/263-2300-ETH-spring11/midterm/midterm.pdf ]
I was confused about how they got "the cache stores 128 x 128 elements". I thought the cache size was 64K (2^16).
Also, can someone explain how to approach each question? My professor had some formula to calculate the number of accesses in each block: block size/stride, but it doesn't seem to work here.
As far as I understand, in case 1 both the src and dst matrices are 64 KB in size (128 * 128 * 4 bytes). Since the cache is direct mapped and also 64 KB in size, the entries of src and dst with the same indexes map to the same location in the cache (since (0 + i) mod 64K = (64K + i) mod 64K), yet both are needed at the same time in the line
dest[i][j]=src[i][j]
Therefore you have a 100% miss rate. The same applies to case 3, since the new size (128 * 256 * 4 bytes) is a multiple of 64 KB, so it doesn't make any difference.
But for case 2 the size of each matrix becomes 96 KB (128 * 192 * 4 bytes), so now src and dst may be loaded at the same time and you will have a lower miss rate.
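The reasoning above can be checked with a tiny direct-mapped cache simulation. This is a sketch under several assumptions that are not stated in the question: src starts at address 0, dest follows it directly in memory, an int is 4 bytes, the loop walks the matrices row by row, and a write miss loads the block into the cache. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CACHE_BYTES (64u * 1024u)               /* 64K direct-mapped cache */
#define BLOCK_BYTES 16u                         /* 16-byte blocks */
#define NUM_LINES   (CACHE_BYTES / BLOCK_BYTES) /* 4096 lines */

struct dm_cache {
    bool     valid[NUM_LINES];
    uint32_t tag[NUM_LINES];
};

/* Access one address; returns true on a miss and installs the block. */
static bool dm_cache_access(struct dm_cache *c, uint32_t addr)
{
    uint32_t block = addr / BLOCK_BYTES;
    uint32_t line  = block % NUM_LINES;  /* cache line the block maps to */
    uint32_t tag   = block / NUM_LINES;  /* tells apart blocks sharing a line */

    if (c->valid[line] && c->tag[line] == tag)
        return false;
    c->valid[line] = true;
    c->tag[line]   = tag;
    return true;
}

/* Count misses for dest[i][j] = src[i][j] over a rows x cols int matrix. */
unsigned long count_copy_misses(uint32_t rows, uint32_t cols)
{
    assert(rows > 0 && cols > 0);
    struct dm_cache cache;
    memset(&cache, 0, sizeof cache);               /* cache starts empty */

    const uint32_t elem     = 4;                   /* assumed sizeof(int) */
    const uint32_t src_base = 0;                   /* assumed placement */
    const uint32_t dst_base = rows * cols * elem;  /* dest right after src */
    unsigned long misses = 0;

    for (uint32_t i = 0; i < rows; i++) {
        for (uint32_t j = 0; j < cols; j++) {
            uint32_t off = (i * cols + j) * elem;
            misses += dm_cache_access(&cache, src_base + off); /* read src */
            misses += dm_cache_access(&cache, dst_base + off); /* write dest */
        }
    }
    return misses;
}
```

Dividing the returned count by the number of accesses (2 * rows * cols) gives the miss rate. Under these assumptions the 128 x 128 and 128 x 256 cases miss on every access, while 128 x 192 misses on one access in four.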

Linux: buddy system free memory

Could anyone explain this code?
page_idx = page_to_pfn(page) & ((1 << MAX_ORDER) - 1);
page_to_pfn() has already returned the page_idx, so what is the '&' used for? Or does page_to_pfn() return something else?
You need to know that x & ((1 << n) - 1) is a trick meaning x % (1 << n), that is, x modulo 2^n, for non-negative x. Often it's faster (but it's better to leave these kinds of optimizations to the compiler).
So in this case it performs a modulo by 2^MAX_ORDER. This causes a wrap-around: once the page frame number reaches 2^MAX_ORDER, page_idx goes back to 0. Here is equivalent, but more readable code:
const unsigned long MAX_ORDER_N = 1UL << MAX_ORDER; /* 2^MAX_ORDER */
page_idx = page_to_pfn(page);
/* wraparound: same result as page_idx % MAX_ORDER_N */
while (page_idx >= MAX_ORDER_N) {
page_idx -= MAX_ORDER_N;
}
It's a bit mask that ensures that page_idx stays below a certain value (2^MAX_ORDER).
# define MAX_ORDER (8)
(1 << MAX_ORDER) /* 100000000 */
- 1 /* subtracting 1 from a power of two sets all lower bits: 011111111 */
So you only have the eight least significant bits left
1010010101001
& 0000011111111
= 0000010101001
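The masking shown above can be reproduced in a few lines of plain C. low_bits() is a hypothetical helper, not a kernel function; it keeps the low `order` bits of a value, which for unsigned values is the same as taking the remainder modulo 2^order.

```c
#include <assert.h>
#include <limits.h>

/* Keep only the low `order` bits of pfn; equal to pfn % (1UL << order). */
unsigned long low_bits(unsigned long pfn, unsigned int order)
{
    assert(order < sizeof(unsigned long) * CHAR_BIT); /* shift must be in range */
    return pfn & ((1UL << order) - 1);
}
```

With order 8, the value 1010010101001 (0x14A9) from the example is reduced to 10101001 (0xA9).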
Check this function and it will be clear:
static inline struct page *
__page_find_buddy(struct page *page, unsigned long page_idx, unsigned int order)
{
unsigned long buddy_idx = page_idx ^ (1 << order);
return page + (buddy_idx - page_idx);
}
It just limits page_idx to a range of 8MB, maybe because the maximum block size is 4MB (1024 pages) and such a block cannot be merged again; only 2MB blocks can merge into a 4MB block, and the buddy block can be before or after the page, so
the whole range is [page_idx - 2MB, page_idx + 2MB]?
The absolute value of page_idx is not important, but the offset (buddy_idx - page_idx) is: add it to page to get the real buddy address.
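The index arithmetic in __page_find_buddy() can be tried on plain integers. The helpers below are hypothetical user-space stand-ins that work on indexes only, not on struct page pointers; they assume the block's start index is aligned to its own size (2^order pages).

```c
#include <assert.h>
#include <limits.h>

/* Index of the buddy of the block that starts at page_idx with this order. */
unsigned long buddy_index(unsigned long page_idx, unsigned int order)
{
    assert(order < sizeof(unsigned long) * CHAR_BIT);
    assert((page_idx & ((1UL << order) - 1)) == 0); /* start must be order-aligned */
    return page_idx ^ (1UL << order);               /* flip the bit for this order */
}

/* Signed distance in pages from the block to its buddy. */
long buddy_offset(unsigned long page_idx, unsigned int order)
{
    return (long)buddy_index(page_idx, order) - (long)page_idx;
}
```

Flipping bit `order` moves to the buddy after the block when that bit is 0 and to the buddy before it when the bit is 1, which is why the offset can be positive or negative.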

mprotect - how aligning to multiple of pagesize works?

I am not understanding the 'aligning allocated memory' part from the mprotect usage.
I am referring to the code example given on http://linux.die.net/man/2/mprotect
char *p;
char c;
/* Allocate a buffer; it will have the default
protection of PROT_READ|PROT_WRITE. */
p = malloc(1024+PAGESIZE-1);
if (!p) {
perror("Couldn't malloc(1024)");
exit(errno);
}
/* Align to a multiple of PAGESIZE, assumed to be a power of two */
p = (char *)(((int) p + PAGESIZE-1) & ~(PAGESIZE-1));
c = p[666]; /* Read; ok */
p[666] = 42; /* Write; ok */
/* Mark the buffer read-only. */
if (mprotect(p, 1024, PROT_READ)) {
perror("Couldn't mprotect");
exit(errno);
}
For my understanding, I tried using a PAGESIZE of 16, and 0010 as address of p.
I ended up getting 0001 as the result of (((int) p + PAGESIZE-1) & ~(PAGESIZE-1)).
Could you please clarify how this whole 'alignment' works?
Thanks,
Assuming that PAGESIZE is a power of 2 (a requirement), an integral value x can be rounded down to a multiple of PAGESIZE with (x & ~(PAGESIZE-1)). Similarly, ((x + PAGESIZE-1) & ~(PAGESIZE-1)) will result in x rounded up to a multiple of PAGESIZE.
For example, if PAGESIZE is 16, then in binary with a 32-bit word:
00000000000000000000000000010000 PAGESIZE
00000000000000000000000000001111 PAGESIZE-1
11111111111111111111111111110000 ~(PAGESIZE-1)
A bitwise-and (&) with the above value will clear the low 4 bits of the value, making it a multiple of 16.
That said, the code quoted in the description is from an old version of the manual page, and is not good because it wastes memory and does not work on 64-bit systems. It is better to use posix_memalign() or memalign() to obtain memory that is already properly aligned. The example on the current version of the mprotect() manual page uses memalign(). The advantage of posix_memalign() is that it is part of the POSIX standard, and does not have different behavior on different systems like the older non-standard memalign().
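A minimal sketch of the posix_memalign() approach suggested above, assuming a Linux-like system where mprotect() may be applied to heap memory. The helper names are hypothetical. The length is padded to whole pages so that changing the protection does not affect neighbouring heap data.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose posix_memalign() in strict -std modes */
#endif

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round len up to a whole number of pages; page must be a power of two. */
static size_t round_up_pages(size_t len, size_t page)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    return (len + page - 1) & ~(page - 1);
}

/*
 * Allocate at least len bytes, page aligned and padded to whole pages.
 * Returns NULL on failure; *region_len receives the padded length.
 */
void *alloc_pages(size_t len, size_t *region_len)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p = NULL;

    if (page <= 0)
        return NULL;
    *region_len = round_up_pages(len, (size_t)page);
    if (posix_memalign(&p, (size_t)page, *region_len) != 0)
        return NULL;
    return p;
}

/* Change the protection of a region from alloc_pages(); 0 on success. */
int protect_region(void *p, size_t region_len, int prot)
{
    return mprotect(p, region_len, prot);
}
```

A caller would mark the region read-only with protect_region(p, n, PROT_READ) and should restore PROT_READ | PROT_WRITE before passing the pointer to free(), since the allocator may write into the block when it is released.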
