Is kmalloc allocation not virtually contiguous?

I found that kmalloc returns physically and virtually contiguous memory.
I wrote some code to observe the behavior, but only the physical memory seems to be contiguous and not the virtual. Am I making any mistake?
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/moduleparam.h>
MODULE_LICENSE("GPL");
static char *ptr;
int alloc_size = 1024;
module_param(alloc_size, int, 0);
static int test_hello_init(void)
{
ptr = kmalloc(alloc_size,GFP_ATOMIC);
if(!ptr) {
/* handle error */
pr_err("memory allocation failed\n");
return -ENOMEM;
} else {
pr_info("Memory allocated successfully:%p\t%p\n", ptr, ptr+100);
pr_info("Physical address:%llx\t %llx\n", virt_to_phys(ptr), virt_to_phys(ptr+100));
}
return 0;
}
static void test_hello_exit(void)
{
kfree(ptr);
pr_info("Memory freed\n");
}
module_init(test_hello_init);
module_exit(test_hello_exit);
dmesg output:
Memory allocated successfully:0000000083318b28 000000001fba1614
Physical address:1d5d09c00 1d5d09c64

Printing kernel pointers is in general a bad idea, because it basically means leaking kernel addresses to user space, so when using %p in printk() (or similar macros like pr_info() etc.), the kernel tries to protect itself and does not print the real address. Instead, it prints a different hashed unique identifier for that address.
If you really want to print that address, you can use %px.
From Documentation/core-api/printk-formats.rst:
Pointer Types
Pointers printed without a specifier extension (i.e unadorned %p) are
hashed to give a unique identifier without leaking kernel addresses to user
space. On 64 bit machines the first 32 bits are zeroed. If you really
want the address see %px below.
%p abcdef12 or 00000000abcdef12
Then, later below:
Unmodified Addresses
%px 01234567 or 0123456789abcdef
For printing pointers when you really want to print the address. Please
consider whether or not you are leaking sensitive information about the
Kernel layout in memory before printing pointers with %px. %px is
functionally equivalent to %lx. %px is preferred to %lx because it is more
uniquely grep'able. If, in the future, we need to modify the way the Kernel
handles printing pointers it will be nice to be able to find the call
sites.

Related

Why does setting a value at an arbitrary memory location not work?

I have this code:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>
int main (int argc, char** argv) {
*(volatile uint8_t*)0x12345678u = 1;
int var = *(volatile uint8_t*)0x12345678;
printf("%i", var);
printf("%i", &var);
return (EXIT_SUCCESS);
}
I want to see a 1 and the address of that int, which I specified previously. But when the program is compiled with gcc and run in bash, only "command terminated" is shown, without any error message. Does anyone know why?
PS: I am newbie to C, so just experimenting.

What you are doing:
*(volatile uint8_t*)0x12345678u = 1;
int var = *(volatile uint8_t*)0x12345678;
is totally wrong.
You have no guarantee whatsoever that an arbitrary address like 0x12345678 will be accessible, not to mention writable by your program. In other words, you cannot set a value to an arbitrary address and expect it to work. It's undefined behavior to say the least, and will most likely crash your program due to the operating system stopping you from touching memory you don't own.
The "command terminated" that you get when trying to run your program happens exactly because the operating system is preventing your program from accessing a memory location it is not allowed to access. Your program gets killed before it can do anything.
If you are on Linux, you can use the mmap function to request a memory page at an (almost) arbitrary address before accessing it (see man mmap). Here's an example program which achieves what you want:
#include <sys/mman.h>
#include <stdio.h>
#define WANTED_ADDRESS (void *)0x12345000
#define WANTED_OFFSET 0x678 // 0x12345000 + 0x678 = 0x12345678
int main(void) {
// Request a memory page starting at 0x12345000 of 0x1000 (4096) bytes.
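unsigned char *mem = mmap(WANTED_ADDRESS, 0x1000, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
// mmap returns MAP_FAILED (and sets errno) on error.
if (mem == MAP_FAILED) {
perror("mmap failed");
return 1;
}
// Without MAP_FIXED the address is only a hint, so check that the OS actually honored it.
if (mem != WANTED_ADDRESS) {
fprintf(stderr, "mmap returned a different address: %p\n", (void *)mem);
return 1;
}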
// Get a pointer inside that page.
int *ptr = (int *)(mem + WANTED_OFFSET); // 0x12345678
// Write to it.
*ptr = 123;
// Inspect the results.
printf("Value : %d\n", *ptr);
printf("Address: %p\n", (void *)ptr);
return 0;
}

The operating system and loader do not automatically make every possible address available to your program. The virtual address space of your process is constructed on demand by various operations of the program loader and of services inside the process. Although every address “exists” in the sense of being a potential address of memory, what happens when a process attempts to access an address is controlled by special data structures in the system. Those data structures control whether a process can read, write, or execute various portions of memory, whether the virtual addresses are currently mapped to physical memory, and whether the virtual addresses are not currently mapped to memory but will be provided with physical memory when needed. Initially, much of a process’ address space is marked not in use (or at least implicitly marked, in that none of the explicit records for the address space apply to it).
In the executions of your program you have attempted so far, the address 0x12345678 has not been mapped and marked available to your process, so, when your process attempted to use it, the system detected a fault and terminated your process.
(Some systems randomize the layout of the address space when a program is being loaded, to make it harder for an attacker to exploit bugs in a program. Because of this, it is possible that 0x12345678 will be accessible in some executions of your program and not others.)

The quote from C11 standard 6.5.3.2p4:
4 The unary * operator denotes indirection. [...] If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.
You use the * operator on the pointer (volatile uint8_t*)0x12345678u. Is this a valid pointer? Is it an invalid pointer? What is an "invalid value" of a pointer?
There is no check in the C language that tells you which particular pointer values are valid and which are not. A random pointer may just happen to be valid, but most probably it is invalid, in which case the behavior is undefined.
Dereferencing an invalid pointer is undefined behavior. But, stepping outside of C and into the operating system: on *nix systems, trying to access memory that you are not allowed to access should raise SIGSEGV and terminate your program. Most probably this is what happens here. Your program is not allowed to access the memory location at 0x12345678; the operating system specifically protects against that.
Also note that systems use ASLR, so pointer values within your program are to some degree random. They are also not linear, i.e. *(char*)0x01 will not access the first byte of your RAM. The operating system (or, more exactly, the underlying hardware as configured by the operating system) translates pointer values in your program to physical locations in RAM using what is called virtual memory. The same pointer value may just happen to be valid on the second run of your program. But because pointers can have so many values, most probably it isn't a valid pointer, and your operating system kills your program when it detects the invalid memory access.

Memcpy with function pointers leads to a segfault

I know I can just copy the function by reference, but I want to understand what's going on in the following code that produces a segfault.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int return0()
{
return 0;
}
int main()
{
int (*r0c)(void) = malloc(100);
memcpy(r0c, return0, 100);
printf("Address of r0c is: %x\n", r0c);
printf("copied is: %d\n", (*r0c)());
return 0;
}
Here's my mental model of what I thought should work.
The process owns the memory allocated to r0c. We are copying the data from the data segment corresponding to return0, and the copy is successful.
I thought that dereferencing a function pointer is the same as calling the data segment that the function pointer points to. If that's the case, then the instruction pointer should move to the data segment corresponding to r0c, which will contain the instructions for function return0. The binary code corresponding to return0 doesn't contain any jumps or function calls that would depend on the address of return0, so it should just return 0 and restore ip... 100 bytes is certainly enough for the function pointer, and 0xc3 is well within the bounds of r0c (it is at byte 11).
So why the segmentation fault? Is this a misunderstanding of the semantics of C's function pointers or is there some security feature that prevents self-modifying code that I'm unaware of?

The memory pages used by malloc to allocate memory are not marked as executable. You can't copy code to the heap and expect it to run.
If you want to do something like that, you have to go a level deeper and ask the operating system for pages yourself (for example with mmap() or VirtualAlloc()), then mark them as executable. This does not normally require administrator rights, although hardened systems may refuse to make writable memory executable.
And it's really dangerous. If you do this in a program you distribute and have some kind of bug that lets an attacker use your program to write to those allocated memory pages, then the attacker can run arbitrary code with your program's privileges and possibly take control of the computer.
There are also other problems with your code. Pointers to functions might not convert cleanly to object pointers on all platforms. It's very hard (not to mention non-standard) to predict or otherwise get the size of a function. You also print the pointer incorrectly in your code example: use the "%p" format, which expects a void *, so a cast to void * is needed.
Also, declaring a function as int fun() is not the same as declaring a function that takes no arguments. If you want to declare a function that takes no arguments, you should explicitly use void, as in int fun(void).

The standard says:
The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1.
[C2011, 7.24.2.1/2; emphasis added]
In the standard's terminology, functions are not "objects". The standard does not define behavior for the case where the source pointer points to a function, therefore such a memcpy() call produces undefined behavior.
Additionally, the pointer returned by malloc() is an object pointer. C does not provide for direct conversion of object pointers to function pointers, and it does not provide for objects to be called as functions. It is possible to convert between object pointer and function pointer by means of an intermediate integer value, but the effect of doing so is at minimum doubly implementation-defined. Under some circumstances it is undefined.
As in other cases, UB can turn out to be precisely the behavior you hoped for, but it is not safe to rely on that. In this particular case, other answers present good reasons to not expect to get the behavior you hoped for.

As was said in some comments, you need to make the data executable. This requires communicating with the OS to change protections on the data. On Linux, this is the system call int mprotect(void* addr, size_t len, int prot) (see http://man7.org/linux/man-pages/man2/mprotect.2.html).
Here is a Windows solution using VirtualProtect.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#ifdef _WIN32
#include <Windows.h>
#endif
int return0()
{
return 0;
}
int main()
{
int (*r0c)(void) = malloc(100);
memcpy((void*) r0c, (void*) return0, 100);
printf("Address of r0c is: %p\n", (void*) r0c);
#ifdef _WIN32
long unsigned int out_protect;
if(!VirtualProtect((void*) r0c, 100, PAGE_EXECUTE_READWRITE, &out_protect)){
puts("Failed to mark r0c as executable");
exit(1);
}
#endif
printf("copied is: %d\n", (*r0c)());
return 0;
}
And it works.

malloc returns a pointer to a block of allocated memory (100 bytes in your case). This memory is uninitialized; assuming it could be executed by the CPU, for your code to work you would have to fill those 100 bytes with the executable instructions that implement the function (if indeed they fit in 100 bytes). But as has been pointed out, your allocation is on the heap, not in the text (program) segment, and I don't think it can be executed as instructions. Perhaps this would achieve what you want:
#include <stdio.h>
int return0(void)
{
return 0;
}
typedef int (*r0c)(void);
int main(void)
{
r0c pf = return0;
/* %p expects a void *; converting a function pointer this way is a common extension. */
printf("Address of r0c is: %p\n", (void *)pf);
printf("copied is: %d\n", pf());
return 0;
}

malloc() 5GB memory on a 32 bit machine

I was reading in a book:
The virtual address space of a process on a 32-bit machine is 2^32 bytes, i.e. 4 GB. Every address seen in the program is a virtual address. The 4 GB of space is further divided by a 3 GB/1 GB user/kernel split.
To better understand this, I called malloc() for 5 GB of space and tried to print all the addresses. How can the application print addresses across the whole 5 GB when it has only 3 GB of virtual address space? Am I missing something here?

malloc() takes a size_t as its argument. On a 32-bit system this is an alias for some unsigned 32-bit integer type. This means that you simply cannot pass any value bigger than 2^32-1 to malloc(), making it impossible to request more than 4 GB of memory with this function.
The same is true for all other functions that can be used to allocate memory. Ultimately they all end up as either a brk() or an mmap() syscall. The length argument of mmap() is also of type size_t, and in the case of brk() you have to provide a pointer to the new end of your allocated space. That pointer is again 32 bits wide.
So there is absolutely no way to tell the kernel that you would like more than 4 GB of memory allocated in one call. And that's no accident: it just wouldn't make any sense anyway.
Now, it's true that you could make several calls to malloc or another allocation function, requesting more than 4 GB in total. If you try this, the call that would extend the allocated memory beyond 3 GB will fail, as there is just no address space available.
So I guess that you either didn't check the malloc return value or you did try to run code like this (or something similar):
int main() {
assert(malloc(5*1<<30));
}
and assumed that you had succeeded in allocating 5 GB, without noticing that your argument overflowed: instead of requesting 5368709120 bytes, you requested 1073741824. One way to verify this on Linux is to use ltrace:
$ ltrace ./a.out
__libc_start_main(0x804844c, 1, 0xbfbcea74, 0x80484a0, 0x8048490 <unfinished ...>
malloc(1073741824) = 0x77746008
$

There's already a good answer. Just in case, the size of your virtual address space is easily verifiable like this:
#include <stdlib.h>
#include <stdio.h>
int main()
{
size_t size = (size_t)-1L;
void *foo;
printf("trying to allocate %zu bytes\n", size);
if (!(foo = malloc(size)))
{
perror("malloc()");
}
else
{
free(foo);
}
}
> gcc -m32 -omalloc malloc.c && ./malloc
trying to allocate 4294967295 bytes
malloc(): Cannot allocate memory
This must fail because parts of your address space are already occupied: by the mapped part of the kernel, by mapped shared libraries and by your program, of course.

You cannot do this because no allocation function on a 32-bit system can accept a request for 5 GB of memory.

Does malloc() use brk() or mmap()?

c code:
// program break mechanism
// TLPI exercise 7-1
#include <stdio.h>
#include <stdlib.h>
void program_break_test() {
printf("%10p\n", sbrk(0));
char *bl = malloc(1024 * 1024);
printf("%x\n", sbrk(0));
free(bl);
printf("%x\n", sbrk(0));
}
int main(int argc, char **argv) {
program_break_test();
return 0;
}
When compiling the following line:
printf("%10p\n", sbrk(0));
I get this warning:
format ‘%p’ expects argument of type ‘void *’, but argument 2 has type ‘int’
Question 1: Why is that?
And after I malloc(1024 * 1024), it seems the program break didn't change.
Here is the output:
9b12000
9b12000
9b12000
Question 2: Does the process allocate heap memory at startup for future use? Or does the compiler change the point at which the allocation happens? Otherwise, why?
[update] Summary: brk() or mmap()
After reviewing TLPI and checking the man page (with help from the author of TLPI), I now understand how malloc() decides whether to use brk() or mmap():
mallopt() can set parameters that control the behavior of malloc(), and one of them is named M_MMAP_THRESHOLD. In general:
If the requested size is less than the threshold, brk() is used;
If the requested size is greater than or equal to the threshold, mmap() is used.
The default value of the parameter is 128 KB (on my system), but my test program requested 1 MB, so mmap() was chosen. When I changed the request to 32 KB, I saw that brk() was used.
TLPI mentions this on pages 147 and 1035, but I hadn't read that part carefully.
Detailed information about the parameter can be found in the man page for mallopt().

If we change the program to see where the malloc'd memory is:
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
void program_break_test() {
printf("%10p\n", sbrk(0));
char *bl = malloc(1024 * 1024);
printf("%10p\n", sbrk(0));
printf("malloc'd at: %10p\n", bl);
free(bl);
printf("%10p\n", sbrk(0));
}
int main(int argc, char **argv) {
program_break_test();
return 0;
}
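With the added #include <unistd.h>, sbrk() is properly declared as returning void *, which removes the %p warning; without it the compiler assumes an implicit declaration returning int. The output also makes it clearer why sbrk(0) doesn't change: the memory given to us by malloc is mapped at a wildly different location.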
You could also use strace on Linux to see what system calls are made, and find out that malloc is using mmap to perform the allocation.

malloc is not limited to using sbrk to allocate memory. It might, for example, use mmap to map a large MAP_ANONYMOUS block of memory; normally mmap will assign a virtual address well away from the data segment.
There are other possibilities, too. In particular, malloc, being a core part of the standard library, is not itself limited to standard library functions; it can make use of operating-system-specific interfaces.

If you use malloc in your code, it calls brk() on first use and allocates 0x21000 bytes of heap; that is the address you printed. So, for Question 2: subsequent malloc requests can be met from this pre-allocated space, so they do not call brk at all. This is an optimization in malloc. When a later request no longer fits in that space, a new brk call is made (provided the request is not larger than the mmap threshold).

Pointer address changed using Malloc

Here's the code snippet:
void main() {
int i,*s;
for(i=1;i<=4;i++) {
s=malloc(sizeof(int));
printf("%lu \n",(unsigned long)s);
}
}
The size of int on my computer is 2 bytes, so shouldn't printf show the address increasing by 2 bytes (16 bits) each time? Instead it prints:
2215224120
2215224128
2215224136...
Why is this so?

How memory is managed is entirely up to your C library and operating system. They can hand out memory from all over the place, so you can make absolutely no assumptions about where it will be.
Most memory allocators also have some overhead, so even a simple 2-byte allocation might take up 8 bytes or more. Besides, addresses might need to be aligned for several reasons (like performance, and because some CPUs even crash when reading from unaligned addresses).
Bottom line - take the return value from malloc as it is, don't make any guesses or assumptions.

It's called alignment. Most CPUs require or prefer data to be aligned on some boundary, commonly 4 or 8 bytes. On some architectures a misaligned access causes a segfault or bus error; on others it is merely slower.

malloc() does not provide any such guarantees. It just allocates some memory according to its own memory management decisions and returns you a pointer to that. In fact, many implementations use extra memory right before the pointer returned for memory management metadata.

malloc() gives you an abstraction on the underlying hardware, OS, drivers, etc. The memory allocation pattern may differ from machine to machine due to various parameters.
But the following are a few things that always hold for malloc():
The malloc() function allocates size bytes and returns a pointer to the allocated memory.
The memory is not initialized.
If size is 0, then malloc() returns either NULL, or a unique pointer value that can later be successfully passed to free().
malloc() returns a pointer to the allocated memory that is suitably aligned for any kind of variable. On error, it returns NULL.
NULL may also be returned by a successful call to malloc() with a size of zero.
On a side note, you can use the %p format specifier to print pointers.
I modified the program as follows:
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int i,*s;
printf("sizeof(int) = %zu \n", sizeof(int));
for(i=1;i<=4;i++) {
if ((s=malloc(sizeof(int))) == NULL) {
printf("unable to allocate memory \n");
return -1;
}
/* %p expects a void *, so cast explicitly. */
printf("%p \n",(void *)s);
}
return 0;
}
The output is as follows:
$ ./a.out
sizeof(int) = 4
0x9d5a008
0x9d5a018
0x9d5a028
0x9d5a038
$

You have no guarantees whatsoever about the pattern of addresses malloc returns to you.
