I am trying to allocate memory using kmalloc in kernel code, specifically in a queueing discipline. I want to allocate memory for q->agg_queue_hdr, where q is a queueing discipline and agg_queue_hdr is a struct pointer. If I allocate memory like this:
q->agg_queue_hdr=kmalloc(sizeof(struct agg_queue), GFP_ATOMIC);
the kernel crashes. Based on the kmalloc examples I found while searching, I changed it to:
agg_queue_hdr=kmalloc(sizeof(struct agg_queue), GFP_ATOMIC);
with which the kernel doesn't crash. Now I want to know: how can I allocate memory for the pointer q->agg_queue_hdr?
Make sure q points to a valid area of memory. Then you should be able to assign q->agg_queue_hdr the way you had it to begin with.
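For illustration, here is a minimal sketch of how that might look inside the qdisc's init callback, assuming q is the qdisc's private data obtained with qdisc_priv(); the struct name my_sched_data and the callback name are placeholders, and the exact init signature varies between kernel versions:
static int my_qdisc_init(struct Qdisc *sch, struct nlattr *opt)
{
    /* q is the private area allocated together with the Qdisc itself */
    struct my_sched_data *q = qdisc_priv(sch);

    q->agg_queue_hdr = kmalloc(sizeof(struct agg_queue), GFP_ATOMIC);
    if (!q->agg_queue_hdr)
        return -ENOMEM;
    return 0;
}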
Why don't you modify your code as below, which would avoid the kernel panic by checking that q is valid before dereferencing it:
if (q) {
    q->agg_queue_hdr = kmalloc(sizeof(struct agg_queue), GFP_ATOMIC);
}
else {
    printk("[+] q invalid\n");
    dump_stack(); // print the call stack in the kernel log
}
If you disassemble the code around "q->agg_queue_hdr", you will see that the kernel panic occurs at the "ldr" instruction that dereferences q (on ARM, for example).
I've been reading a lot about memory allocation on the heap and how certain heap management allocators do it.
Say I have the following program:
#include<stdlib.h>
#include<stdio.h>
#include<unistd.h>
int main(int argc, char *argv[]) {
// allocate 4 gigabytes of RAM
void *much_mems = malloc(4294967296);
// sleep for 10 minutes
sleep(600);
// free the RAM
free(much_mems);
// sleep some more
sleep(600);
return 0;
}
Let's say for the sake of argument that my compiler doesn't optimize out anything above, that I can actually allocate 4 GiB of RAM, that the malloc() call returns an actual pointer and not NULL, that size_t can hold an integer as big as 4294967296 on my given platform, and that the allocator behind the malloc call actually does allocate that amount of RAM on the heap. Pretend that the above code does exactly what it looks like it will do.
After the call to free executes, how does the kernel know that those 4 GiB of RAM are now eligible for use for other processes and for the kernel itself? I'm not assuming the kernel is Linux, but that would be a good example. Before the call to free, this process has a heap size of at least 4GiB, and afterward, does it still have that heap size?
How do modern operating systems allow userspace programs to return memory back to kernel space? Do free implementations execute a syscall to the kernel (or many syscalls) to tell it which areas of memory are now available? And is it possible that my 4 GiB allocation will be non-contiguous?
Do free implementations execute a syscall to the kernel (or many syscalls) to tell it which areas of memory are now available?
Yes.
A modern implementation of malloc on Linux will call mmap to allocate a large amount of memory. The kernel will find an unused virtual address, mark it as allocated, and return it. (The kernel may also return an error if there isn't enough free memory)
free would then call munmap to deallocate the memory, passing the address and size of the allocation.
On Windows, malloc will call VirtualAlloc and free will call VirtualFree.
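For illustration, here is a minimal sketch of roughly what the Linux allocator does under the hood for such a large request; the size and error handling are simplified, and a real allocator also adds its own bookkeeping headers:
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    size_t size = (size_t)4096 * 1024 * 1024;  /* 4 GiB */

    /* What malloc(size) roughly boils down to for a large allocation: */
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* ...and what free(p) roughly boils down to: */
    munmap(p, size);
    return 0;
}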
On GNU/Linux with Glibc, large memory allocations, of more than a few hundred kilobytes, are handled by calling mmap. When the free function is invoked on this, the library knows that the memory was allocated this way (thanks to meta-data stored in a header). It simply calls munmap on it to release it. That's how the kernel knows; its mmap and munmap API is being used.
You can see these calls if you run strace on the program.
The kernel keeps track of all mmap-ed regions using a red-black tree. Given an arbitrary virtual address, it can quickly determine whether it lands in the mmap area, and which mapping, by performing a tree walk.
Before the call to free, this process has a heap size of at least 4GiB...
The C language does not define either "heap" or "stack". Before the call to free, this process has a chunk of 4 GiB of dynamically allocated memory...
and afterward, does it still have that heap size?
...and after the free(), access to that memory would be undefined behaviour, so for practical purposes, that dynamically allocated memory is no longer "there".
What the library does "under the hood" (e.g. caching, see below) is up to the library, and is subject to change without further notice. This could change with the amount of available physical memory, system load, runtime parameters, ...
How do modern operating systems allow userspace programs to return memory back to kernel space?
It's up to the standard library's implementation to decide (which, of course, has to talk to the operating system to actually, physically allocate / free memory).
Others have pointed out how certain, existing implementations do it. Other libraries, operating systems, and environments exist.
Do free implementations execute a syscall to the kernel (or many syscalls) to tell it which areas of memory are now available?
Possibly. A common optimization done by library implementations is to "cache" free()d memory, so subsequent malloc() calls can be served without talking to the kernel (which is a costly operation). When, how much, and how long memory is cached this way is, you guessed it, implementation-defined.
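As a concrete (glibc-specific) example, a program can explicitly ask the allocator to hand cached free memory back to the kernel with malloc_trim(); a minimal sketch:
#include <malloc.h>   /* glibc-specific: malloc_trim() */
#include <stdlib.h>

int main(void)
{
    void *p = malloc(1 << 20);  /* a 1 MiB block, small enough to come from the heap */
    free(p);                    /* the allocator may keep this memory cached */
    malloc_trim(0);             /* ask glibc to return unused heap pages to the kernel */
    return 0;
}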
And is it possible that my 4 GiB allocation will be non-contiguous?
The process will always "see" contiguous memory. In a system supporting virtual memory (i.e. "modern" desktop OS's like Linux or Windows), the physical memory might be non-contiguous, but the virtual addresses your process gets to see will be contiguous (or the malloc() would have failed if this requirement could not be serviced).
Again, other systems exist. You might be looking at a system that doesn't virtualize addresses (i.e. gives physical addresses to the process). You might be looking at a system that assigns a given amount of memory to a process on startup, serves any malloc() requests from that, and doesn't support the allocation of additional memory. And so on.
If we're using Linux as an example, it uses mmap to allocate large chunks of memory. This means that when you free it, it gets unmapped, i.e. the kernel is told that it can now unmap this memory. Also read up on the brk and sbrk system calls. A good place to start would be here...
What does brk( ) system call do?
and here. The following post discusses how malloc is implemented, which will give you a good idea of what's happening under the covers...
How is malloc() implemented internally?
Doug Lea's malloc can be found here. It's well commented and public domain...
ftp://g.oswego.edu/pub/misc/malloc.c
malloc() and free() are library functions called by the application to allocate and free memory on the heap; they are not themselves system calls, but they rely on the kernel (via system calls) to obtain and release the memory that backs the heap.
The application itself does not manage the underlying physical memory.
That part of the mechanism is executed at kernel level.
See the heap implementation code below:
void *heap_alloc(uint32_t nbytes) {
heap_header *p, *prev_p; // used to keep track of the current unit
unsigned int nunits; // this is the number of "allocation units" needed by nbytes of memory
nunits = (nbytes + sizeof(heap_header) - 1) / sizeof(heap_header) + 1; // see how much we will need to allocate for this call
// check to see if the list has been created yet; start it if not
if ((prev_p = _heap_free) == NULL) {
_heap_base.s.next = _heap_free = prev_p = &_heap_base; // point at the base of the memory
_heap_base.s.alloc_sz = 0; // and set its allocation size to zero
}
// now enter a for loop to find a block of memory
for (p = prev_p->s.next;; prev_p = p, p = p->s.next) {
// did we find a big enough block?
if (p->s.alloc_sz >= nunits) {
// the block is exact length
if (p->s.alloc_sz == nunits)
prev_p->s.next = p->s.next;
// the block needs to be cut
else {
p->s.alloc_sz -= nunits;
p += p->s.alloc_sz;
p->s.alloc_sz = nunits;
}
_heap_free = prev_p;
return (void *)(p + 1);
}
// not enough space!! Try to get more from the kernel
if (p == _heap_free) {
// if the kernel has no more memory, return error!
if ((p = morecore()) == NULL)
return NULL;
}
}
}
This heap_alloc function uses the morecore function, which is implemented as below:
heap_header *morecore() {
char *cp;
heap_header *up;
cp = (char *)pmmngr_alloc_block(); // allocate more memory for the heap
// if cp is null we have no memory left
if (cp == NULL)
return NULL;
//vmmngr_mapPhysicalAddress(cp, (void *)_virt_addr); // and map it's virtual address to it's physical address
vmmngr_mapPhysicalAddress(vmmngr_get_directory(), _virt_addr, (uint32_t)cp, I86_PTE_PRESENT | I86_PTE_WRITABLE);
_virt_addr += BLOCK_SIZE; // advance the virtual address by BLOCK_SIZE; this will be our next allocation address
up = (heap_header *)cp;
up->s.alloc_sz = BLOCK_SIZE;
heap_free((void *)(up + 1));
return _heap_free;
}
As you can see, this function asks the physical memory manager to allocate a block:
cp = (char *)pmmngr_alloc_block();
and then maps the allocated block into virtual memory:
vmmngr_mapPhysicalAddress(vmmngr_get_directory(), _virt_addr, (uint32_t)cp, I86_PTE_PRESENT | I86_PTE_WRITABLE);
As you can see, the whole story is controlled by the heap manager at the kernel level.
I am trying to implement a user-level thread library in C using calls such as getcontext, swapcontext, etc.
I have a thread control block that looks like this:
struct tcb {
int thread_id;
int thread_pri;
ucontext_t *thread_context;
struct tcb *next;
};
And I have a function called init() that looks like this:
void t_init()
{
tcb *tmp;
tmp = malloc(sizeof(tcb));
getcontext(tmp->thread_context); /* let tmp be the context of main() */
running_head = tmp;
}
I used gdb and found that I get a segmentation fault at runtime at the getcontext(tmp->thread_context) call.
I have read the man pages for getcontext() but am unsure why this causes a segmentation fault.
Any suggestions please?
You haven't allocated any space for thread_context; try:
int t_init()
{
    struct tcb *tmp;

    tmp = malloc(sizeof(struct tcb));
    if (!tmp)
        return -1;
    memset(tmp, 0, sizeof(struct tcb));
    tmp->thread_context = malloc(sizeof(ucontext_t));
    if (!tmp->thread_context)
        return -1;
    getcontext(tmp->thread_context);
    running_head = tmp;
    return 0;
}
The GNU C Library Reference Manual (Chapter 23: Non-Local Exits, page 622) says the following about getcontext/setcontext:
While allocating the memory for the stack one has to be careful.
Most modern processors keep track of whether a certain memory region is allowed to contain code which is executed or not. Data segments and
heap memory is normally not tagged to allow this. The result is that
programs would fail. Examples for such code include the calling
sequences the GNU C compiler generates for calls to nested functions.
Safe ways to allocate stacks correctly include using memory on the
original threads stack or explicitly allocate memory tagged for
execution using memory mapped I/O.
This is causing the problem, and you should allocate the memory in the recommended way (using memory-mapped I/O; for more information, please refer to the libc manual).
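For example, here is a hedged sketch of allocating a context stack with mmap along the lines the manual suggests; STACK_SIZE and thread_func are names made up for this example:
#include <stdlib.h>
#include <sys/mman.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static void thread_func(void) { /* the new thread's entry point */ }

int make_thread_context(ucontext_t *ctx)
{
    /* Add PROT_EXEC if the compiler places trampolines on the stack
     * (e.g. for nested functions), as the manual warns. */
    void *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (stack == MAP_FAILED)
        return -1;

    if (getcontext(ctx) == -1)
        return -1;
    ctx->uc_stack.ss_sp = stack;
    ctx->uc_stack.ss_size = STACK_SIZE;
    ctx->uc_link = NULL;  /* or a context to resume when thread_func returns */
    makecontext(ctx, thread_func, 0);
    return 0;
}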
I have some data stored in FLASH memory that I need to access with C pointers in order to make a non-Linux graphics driver work (I think this requirement is DMA related, but I'm not sure). Calling read works, but I don't want intermediate RAM buffers between the FLASH and the non-Linux driver.
However, just creating a pointer and storing the address I want in it makes Linux raise an invalid-access exception:
int *ptr = (int *) 0xdeadbeef;
int a = *ptr; // invalid access!
What am I missing here? And could someone point me to material that makes these concepts clear to me?
I'm reading about mmap but I'm not sure that this is what I need.
The problem you have is that linux runs your program in a virtual address space. So every address you use directly in the code (like 0xdeadbeef) is a virtual address that gets translated by the memory management unit into a physical address which is not necessarily the same as your virtual address. This allows easy separation of multiple independent processes and other stuff like paging, etc.
The problem now is that, in your case, no physical address is mapped to the virtual address 0xdeadbeef, causing the kernel to abort execution.
The mmap call you already found asks the kernel to map a specific file (from a specific offset) to a virtual address of your process. Note that the address mmap returns could be a completely different address, so don't make any assumptions about the virtual address you get.
That is why there are examples out there that mmap /dev/mem, where the offset into the memory device is the physical address. Once the kernel has mapped the file at the offset you gave into a virtual address of your process, you can access the memory area as if it were a direct access.
When you don't need the area anymore, don't forget to munmap it. Otherwise you'll cause something similar to a memory leak.
One problem with the /dev/mem method is that the user running the process needs access to this device. This could introduce a security issue (e.g. Samsung recently introduced such a security hole in their handheld devices).
A more secure way is the one described in an article I found (The Userspace I/O HOWTO), as you still have control over the memory areas accessible to the user's process.
You need to access the memory differently. Basically, you need to open /dev/mem and use mmap() (as you suggested). Simple example:
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Shared between openMem() and getAddress(). */
static void *mmap_base_address;
static unsigned int mask;

int openMem(unsigned int address, unsigned int size)
{
    int mmapFD;
    int page_size;
    unsigned int page_start_address;
    /* Mask for offsets within the mapped region; assumes size is a power of two. */
    mask = size - 1;
    /* Get the page size. */
    page_size = (int) sysconf(_SC_PAGE_SIZE);
    /* We have to map shared memory at the beginning of a memory page, so adjust
     * the memory address accordingly. */
    page_start_address = address - (address % page_size);
    /* Open the file that will be mapped. */
    if((mmapFD = open("/dev/mem", (O_RDWR | O_SYNC))) == -1)
    {
        printf("Opening shared memory device failed\n");
        return -1;
    }
    mmap_base_address = mmap(0, size, (PROT_READ|PROT_WRITE), MAP_SHARED, mmapFD,
                             (off_t)(page_start_address & ~mask));
    if(mmap_base_address == MAP_FAILED)
    {
        printf("Mapping memory failed\n");
        return -1;
    }
    return 0;
}
unsigned int *getAddress(unsigned int address)
{
    /* Return a pointer to "address" within the mapped window. */
    return (unsigned int *)((char *)mmap_base_address + (address & mask));
}
...
result = openMem(address, 0x10000);
if (result < 0)
return result;
target_address = getAddress(address);
*target_address = value;
This writes "value" to the memory at "address".
You need to call ioremap - something like:
void *myaddr = ioremap(0xdeadbeef, size);
where size is the size of your memory region. You probably want to use a page-aligned address for the first argument, e.g. 0xdeadb000, but I expect your actual device isn't at "0xdeadbeef" anyway.
Edit: The call to ioremap must be done from a driver!
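For illustration, a minimal sketch of how a kernel module might map and read such a region with ioremap; the base address and size here are placeholders:
#include <linux/module.h>
#include <linux/io.h>

#define MY_PHYS_BASE   0x10000000  /* placeholder physical address */
#define MY_REGION_SIZE 0x1000      /* placeholder size */

static void __iomem *regs;

static int __init my_init(void)
{
    regs = ioremap(MY_PHYS_BASE, MY_REGION_SIZE);
    if (!regs)
        return -ENOMEM;
    pr_info("first word: 0x%08x\n", ioread32(regs));  /* read from the mapped region */
    return 0;
}

static void __exit my_exit(void)
{
    iounmap(regs);
}

module_init(my_init);
module_exit(my_exit);
MODULE_LICENSE("GPL");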
I am new to CUDA and I want to use cudaHostAlloc. I was able to isolate my problem to the following code. Using malloc for host allocation works; using cudaHostAlloc results in a segfault, possibly because the allocated area is invalid? When I dump the pointer in both cases it is not null, so cudaHostAlloc returns something...
works
in_h = (int*) malloc(length*sizeof(int)); //works
for (int i = 0;i<length;i++)
in_h[i]=2;
doesn't work
cudaHostAlloc((void**)&in_h,length*sizeof(int),cudaHostAllocDefault);
for (int i = 0;i<length;i++)
in_h[i]=2; //segfaults
Standalone Code
#include <stdio.h>
#include <stdlib.h>
void checkDevice()
{
cudaDeviceProp info;
int deviceName;
cudaGetDevice(&deviceName);
cudaGetDeviceProperties(&info,deviceName);
if (!info.deviceOverlap)
{
printf("Compute device can't use streams and should be discarded.");
exit(EXIT_FAILURE);
}
}
int main()
{
checkDevice();
int *in_h;
const int length = 10000;
cudaHostAlloc((void**)&in_h,length*sizeof(int),cudaHostAllocDefault);
printf("segfault comming %d\n",in_h);
for (int i = 0;i<length;i++)
{
in_h[i]=2; // Segfaults here
}
return EXIT_SUCCESS;
}
Invocation
[id129]$ nvcc fun.cu
[id129]$ ./a.out
segfault comming 327641824
Segmentation fault (core dumped)
Details
The program is run in interactive mode on a cluster. I was told that invoking the program from the compute node pushes it to the cluster. I have not had any trouble with other homemade toy CUDA programs.
Edit
cudaError_t err = cudaHostAlloc((void**)&in_h,length*sizeof(int),cudaHostAllocDefault);
printf("Error status is %s\n",cudaGetErrorString(err));
gives driver error...
Error status is CUDA driver version is insufficient for CUDA runtime version
Always check for errors. It is likely that cudaHostAlloc is failing to allocate any memory. If it fails, you are not bailing out but are instead writing to unallocated address space. In your case malloc allocates the memory as requested and does not fail, but there are cases where malloc can fail as well, so it is best to check the pointer before writing into it.
For future, it may be best to do something like this
int *ptr = NULL;
// Allocate using cudaHostAlloc or malloc
// If using cudaHostAlloc check for success
if (!ptr) ERROR_OUT();
// Write to this memory
EDIT (Response to edit in the question)
The error message indicates you have an older driver compared to the toolkit. If you do not want to be stuck for a while, try downloading an older version of the CUDA toolkit that is compatible with your driver. You can install it in your user account and use its nvcc and libraries temporarily.
Your segfault is not caused by the writes to the block of memory allocated by cudaHostAlloc, but rather from trying to 'free' an address returned from cudaHostAlloc. I was able to reproduce your problem using the code you provided, but replacing free with cudaFreeHost fixed the segfault for me.
cudaFreeHost
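Putting the two answers together, a minimal error-checked sketch might look like this; length is taken from the question, the rest is illustrative:
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

int main(void)
{
    int *in_h = NULL;
    const int length = 10000;

    // Allocate pinned host memory and bail out if it fails.
    cudaError_t err = cudaHostAlloc((void**)&in_h, length * sizeof(int), cudaHostAllocDefault);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaHostAlloc failed: %s\n", cudaGetErrorString(err));
        return EXIT_FAILURE;
    }

    for (int i = 0; i < length; i++)
        in_h[i] = 2;

    // Pinned memory must be released with cudaFreeHost, not free().
    cudaFreeHost(in_h);
    return EXIT_SUCCESS;
}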
For my bachelor thesis I want to visualize the data remanence of memory and how it persists after rebooting a system.
I had the simple idea to mmap a picture to memory, shut down my computer, wait x seconds, boot the computer and see if the picture is still there.
int mmap_lena(void)
{
FILE *fd = NULL;
size_t lena_size;
void *addr = NULL;
fd = fopen("lena.png", "r");
fseek(fd, 0, SEEK_END);
lena_size = ftell(fd);
addr = mmap((void *) 0x12345678, (size_t) lena_size, (int) PROT_READ, (int) MAP_SHARED, (int) fileno(fd), (off_t) 0);
fprintf(stdout, "Addr = %p\n", addr);
munmap((void *) addr, (size_t) lena_size);
fclose(fd);
return EXIT_SUCCESS;
}
I omitted checking return values for clarity's sake.
So after the mmap I tried to somehow get at the address, but I usually end up with a segmentation fault, as to my understanding the memory is protected by my operating system.
int fetch_lena(void)
{
FILE *fd = NULL;
FILE *fd_out = NULL;
size_t lenna_size;
FILE *addr = (FILE *) 0x12346000;
fd = fopen("lena.png", "r");
fd_out = fopen("lena_out.png", "rw");
fseek(fd, 0, SEEK_END);
lenna_size = ftell(fd);
// Segfault
fwrite((FILE *) addr, (size_t) 1, (size_t) lenna_size, (FILE *) fd_out);
fclose(fd);
fclose(fd_out);
return 0;
}
Please also note that I hard-coded the addresses in this example, so whenever you run mmap_lena the value I use in fetch_lena could be wrong, as the operating system takes the first parameter to mmap only as a hint (on my system it always ends up at 0x12346000 somehow).
If there is any trivial coding error, I am sorry; my C skills are not fully developed yet.
I would like to know if there is any way to get to the data I want without implementing any malloc hooks or memory allocator hacks.
Thanks in advance,
David
One issue you have is that you are getting back a virtual address, not the physical address where the memory resides. Next time you boot, the mapping probably won't be the same.
This can definitely be done within a kernel module in Linux, but I don't think there is any sort of API in userspace you can use.
If you have permission (and I assume you could be root on this machine if you are rebooting it), then you can peek at /dev/mem to see the actual physical layout. Maybe you should try sampling values, rebooting, and seeing how many of those values persisted.
There is a similar project where a cold boot attack is demonstrated. The source code is available, maybe you can get some inspiration there.
However, AFAIR they read out the memory without loading an OS first and therefore do not have to mess with the OS's memory protection. Maybe you should try this too, to avoid memory being overwritten or cleared by the OS after boot.
(Also check the video on the site, it's pretty impressive ;)
In the question Direct Memory Access in Linux we worked out most of the fundamentals needed to accomplish this. Note, mmap() is not the answer to this for exactly the reasons that were stated by others .. you need a real address, not virtual, which you can only get inside the kernel (or by writing a driver to relay one to userspace).
The simplest method would be to write a character device driver that can be read or written to, with an ioctl to give you a valid start or ending address. Again, if you want pointers on the memory management functions to use in the kernel, see the question that I've linked to .. most of it was worked out in the comments in the first (and accepted) answer.
Your test code looks odd
FILE *addr = (FILE *) 0x12346000;
fwrite((FILE *) fd_out, (size_t) 1,
(size_t) lenna_size, (FILE *) addr);
You can't just cast an integer to a FILE pointer and expect to get something sane.
Did you also switch the first and last argument to fwrite ? The last argument is supposed to be the FILE* to write to.
You probably want as little OS as possible for this purpose; the more software you load, the more chances of overwriting something you want to examine.
DOS might be a good bet; it uses < 640k of memory. If you don't load HIMEM and instead write your own (assembly required) routine to jump into pmode, copy a block of high memory into low memory, then jump back into real mode, you could write a mostly-real-mode program which can dump out the physical RAM (minus however much the BIOS, DOS and your app use). It could dump it to a flash disk or something.
Of course the real problem may be that the BIOS clears the memory during POST.
I'm not familiar with Linux, but you'll likely need to write a device driver. Device drivers must have some way to convert virtual memory addresses to physical memory addresses for DMA purposes (DMA controllers only deal with physical memory addresses). You should be able to use those interfaces to deal directly with physical memory.
I won't say it's the lowest-effort option, but for completeness' sake:
You can Compile a MMU-less Linux Kernel
And then you can have a blast as all addresses are real.
Note 1: You might still get errors if you access certain hardware/BIOS-mapped address spaces.
Note 2: You don't access memory using files in this case; you just assign an address to a pointer and read its contents.
int* pStart = (int*)0x600000;
int* pEnd = (int*)0x800000;
for(int* p = pStart; p < pEnd; ++p)
{
// do what you want with *p
}