valgrind is reporting uninitialized memory errors from code like this:
unsigned char buf[100];
struct driver_command cmd;
cmd.len = sizeof(buf);
cmd.buf = buf;
ioctl(my_driver_fd, READ, &cmd);
for(i = 0; i < sizeof(buf); i++)
{
foo(buf[i]); /* <<--- uninit use error from valgrind */
}
If I memset() the buf before the driver call, the error goes away.
Can valgrind detect whether the linux driver is properly writing to the buffer? (I looked at the driver code, and it seems to be correct, but maybe I'm missing something.)
Or does it just pass the driver call through, with no way of knowing that the buffer has been written inside the kernel?
Thanks.
Valgrind can't trace execution into the kernel, but it does know the visible semantics of most system calls. ioctl, however, is too unpredictable: what it reads or writes depends on the driver and the request code. If you had written your driver so that this operation was a read call, Valgrind would get it right. That's better practice anyway.
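If changing the driver interface is not an option, Memcheck can be told explicitly that the kernel filled the buffer. The sketch below is one possible way to do that with the VALGRIND_MAKE_MEM_DEFINED client request from valgrind/memcheck.h. The helper name mark_driver_output is hypothetical, and the fallback macro is only there so the file still builds where the Valgrind headers are not installed. It silences the warning rather than proving anything, so it only makes sense once you have confirmed that the driver really writes the whole buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Use the real Memcheck client request when the Valgrind headers are
 * available; otherwise fall back to a no-op so the code still compiles. */
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK_H 1
#  endif
#endif
#ifndef HAVE_MEMCHECK_H
#  define VALGRIND_MAKE_MEM_DEFINED(addr, len) ((void)(addr), (void)(len), 0)
#endif

/* Hypothetical helper: call it right after a successful ioctl() to tell
 * Memcheck that the kernel initialised len bytes starting at buf.
 * It does not touch the buffer contents. */
void mark_driver_output(void *buf, size_t len)
{
    assert(buf != NULL);
    (void)VALGRIND_MAKE_MEM_DEFINED(buf, len);
}
```

In the code from the question, the call would go between the ioctl and the loop: mark_driver_output(buf, sizeof(buf));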
I am trying to write an echo server/client model in C. My code compiles but throws a segmentation fault at run time (I believe in the server process). When testing in the CLion debug environment, the server process is able to execute the accept() system call and enter a waiting state until a client connects. Therefore, I believe that the segmentation fault happens after the client makes the connect() system call.
Here are the relevant snippets of code (only the last part - not full program):
/* [6] LISTEN FOR CONNECTIONS ON BOUND SOCKET===================================================================== */
struct sockaddr_storage ample; /* from Beej Guide 5.6 accept() */
socklen_t ample_sz = sizeof(ample);
fd_activeSock = accept(fd_listenSock, (struct sockaddr *)&established_SERV_param, &ample_sz);
if (fd_activeSock == -1) /* Error checking */
{
fprintf(stderr, "\nNo forum for communication...\nTERMINATING PROCESS");
exit(EXIT_FAILURE);
}
printf("\nCommunication Established! What's your sign??");
freeaddrinfo(established_SERV_param); /* free up memory */
/* [7] ACCEPT A CONNECTION (BLOCKING)============================================================================= */
/* MAIN LOOP====================================================================================================== */
while(1)
{
bzero(msg_incoming, 16);
recv(fd_activeSock, msg_incoming, 16, 0);
printf("%s", msg_incoming);
send(fd_activeSock, msg_incoming, 16, 0);
}
When I run both programs in separate terminals (server process first, of course), the last print statement that runs before the error is:
printf("\nCommunication Established! What's your sign??");
The error is output to the server terminal. There is a core dump; for future issues, could someone suggest a beginner's tutorial on combing through core dump files? Also, I have run the code with the freeaddrinfo() call commented out and still get a segmentation fault, so I do not believe that this is the issue. Why call it at all? I do not want memory leaks. Thank you for your help.
recv() does not explicitly place a null terminator at the end of the buffer, but printf() expects one.
In the statements:
bzero(msg_incoming, 16);
recv(fd_activeSock, msg_incoming, 16, 0);
printf("%s", msg_incoming);
Although msg_incoming has been zeroed, if the recv call fills all 16 elements, there is no guarantee that the last element of the array is '\0', which leaves the buffer without a null terminator. If that happens, a segfault is likely when printf() is called. Or worse, a segfault may not occur, leading you to believe your code works fine (in other words, undefined behavior).
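The fix is to check the return value of recv() and to leave room in the buffer for the terminator:
ssize_t bytes = recv(fd_activeSock, msg_incoming, 15, 0); // read at most 15 bytes, assuming msg_incoming holds 16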
if(bytes <= 0)
{
//handle error/end of message condition
}
else
{
msg_incoming[bytes] = '\0';
printf("%s", msg_incoming);
}
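The same idea can be wrapped in a small helper so that every call site gets the terminator right. This is only a sketch: recv_cstring is a hypothetical name, not a standard function, and it assumes the caller passes the real capacity of the buffer (at least 2 bytes, so that there is room for data as well as the terminator).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: receive at most cap - 1 bytes and always leave buf
 * NUL-terminated. Returns recv()'s result: > 0 bytes read, 0 if the peer
 * closed the connection, -1 on error (errno is set by recv). */
ssize_t recv_cstring(int fd, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);

    ssize_t n = recv(fd, buf, cap - 1, 0);   /* reserve one byte for '\0' */
    buf[n > 0 ? n : 0] = '\0';               /* empty string on EOF or error */
    return n;
}
```

In the echo loop this would replace the bzero/recv pair: call recv_cstring(fd_activeSock, msg_incoming, 16), leave the loop when the result is 0 or negative, and pass the returned byte count (rather than a fixed 16) to send().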
Additional material on Reading data with a socket.
freeaddrinfo(established_SERV_param)
should be called only when established_SERV_param was obtained from getaddrinfo. Here established_SERV_param is probably a stack variable, so you are trying to free a pointer to a stack variable.
Something else looks wrong in your program: freeaddrinfo expects a pointer, but established_SERV_param appears to be a plain variable, since you take its address with & in the call to accept. Removing the call to freeaddrinfo may fix it.
If the above is not enough, it is important to see how msg_incoming is defined or allocated. It should not be a const char array or a pointer initialised from a string literal, since that memory is read-only. If it is a pointer, it should point to adequately sized memory allocated with malloc.
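To make the pairing concrete, here is a minimal sketch of the only situation in which freeaddrinfo is appropriate: releasing a list that getaddrinfo itself allocated. The function name count_listen_candidates is hypothetical and the function does nothing useful beyond counting results. The peer address filled in by accept lives in a caller-owned struct sockaddr_storage and is never passed to freeaddrinfo.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: resolve the wildcard address for port, count the
 * results, and release the list. Returns -1 if getaddrinfo() fails. */
int count_listen_candidates(const char *port)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;   /* filled in by getaddrinfo() */
    int count = 0;

    assert(port != NULL);
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE;   /* wildcard address, suitable for bind() */

    if (getaddrinfo(NULL, port, &hints, &res) != 0)
        return -1;                 /* nothing was allocated, nothing to free */

    for (struct addrinfo *p = res; p != NULL; p = p->ai_next)
        count++;                   /* a real server would socket()/bind() here */

    freeaddrinfo(res);             /* res came from getaddrinfo(), so this is valid */
    return count;
}
```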
Analysing core dump:
Compile your code with debug information on and optimisation off:
gcc -g -O0
Then open the core file in gdb as
gdb <executable> <core file>
(gdb) bt
Above, bt shows the backtrace at the point where the program crashed. You can select the frame of the function it crashed in with the command fr 0 and print some variables. A tutorial for gdb can be found here
I've implemented a char device for my kernel module and implemented a read function for it. The read function calls copy_to_user to return data to the caller. I originally implemented the read function in a blocking manner (with wait_event_interruptible), but the problem reproduces even when I implement read in a non-blocking manner. My code is running on a MIPS processor.
The user space program opens the char device and reads into a buffer allocated on the stack.
What I've found is that occasionally copy_to_user will fail to copy any bytes. Moreover, even if I replace copy_to_user with a call to memcpy (only for the purposes of checking... I know this isn't the right thing to do), and print out the destination buffer immediately afterwards, I see that memcpy has failed to copy any bytes.
I'm not really sure how to further debug this - how can I determine why memory is not being copied? Is it possible that the process context is wrong?
EDIT: Here's some pseudo-code outlining what the code currently looks like:
User mode (runs repeatedly):
char buf[BUF_LEN];
FILE *f = fopen(char_device_file, "rb");
fread(buf, 1, BUF_LEN, f);
fclose(f);
Kernel mode:
char_device =
create_char_device(char_device_name,
NULL,
read_func,
NULL,
NULL);
int read_func(char *output_buffer, int output_buffer_length, loff_t *offset)
{
int rc;
if (*offset == 0)
{
spin_lock_irqsave(&lock, flags);
while (get_available_bytes_to_read() == 0)
{
spin_unlock_irqrestore(&lock, flags);
if (wait_event_interruptible(self->wait_queue, get_available_bytes_to_read() != 0))
{
// Got a signal; retry the read
return -ERESTARTSYS;
}
spin_lock_irqsave(&lock, flags);
}
rc = copy_to_user(output_buffer, internal_buffer, bytes_to_copy);
spin_unlock_irqrestore(&lock, flags);
}
else rc = 0;
return rc;
}
It took quite a bit of debugging, but in the end Tsyvarev's hint (the comment about not calling copy_to_user with a spinlock taken) seems to have been the cause.
Our process had a background thread which occasionally launched a new process (fork + exec). When we disabled this thread, everything worked well. The best theory we have is that the fork made all of our memory pages copy-on-write, so when we tried to copy to them, the kernel had to do some work which could not be done with the spinlock taken. Hopefully it at least makes some sense (although I'd have guessed that this would apply only to the child process, and the parent's process pages would simply remain writable, but who knows...).
We rewrote our code to be lockless and the problem disappeared.
Now we just need to verify that our lockless code is indeed safe on different architectures. Easy as pie.
I have always been told (in books and tutorials) that when copying data from kernel space to user space we should use copy_to_user(), and that using memcpy() would cause problems for the system. Recently I used memcpy() by mistake and it worked perfectly fine, without any problems. Why should we use copy_to_user() instead of memcpy()?
My test code(Kernel module) is something like this:
static ssize_t test_read(struct file *file, char __user * buf,
size_t len, loff_t * offset)
{
char ani[100];
if (!*offset) {
memset(ani, 'A', 100);
if (memcpy(buf, ani, 100))
return -EFAULT;
*offset = 100;
return *offset;
}
return 0;
}
struct file_operations test_fops = {
.owner = THIS_MODULE,
.read = test_read,
};
static int __init my_module_init(void)
{
struct proc_dir_entry *entry;
printk("We are testing now!!\n");
entry = create_proc_entry("test", S_IFREG | S_IRUGO, NULL);
if (!entry) {
printk("Failed to create proc entry test\n");
return -ENOMEM; /* do not dereference a NULL entry below */
}
entry->proc_fops = &test_fops;
return 0;
}
module_init(my_module_init);
From a user-space app, I am reading my /proc entry and everything works fine.
A look at source code of copy_to_user() says that it is also simple memcpy() where we are just trying to check if the pointer is valid or not with access_ok and doing memcpy.
So my understanding currently is that, if we are sure about the pointer we are passing, memcpy() can always be used in place of copy_to_user.
Please correct me if my understanding is incorrect and also, any example where copy_to_user works and memcpy() fails would be very useful. Thanks.
There are a couple of reasons for this.
First, security. The kernel can write to any address it wants, so if you take a user-supplied address and simply memcpy to it, an attacker could pass a pointer to memory the calling process has no right to write (kernel memory, for instance), which is a huge security problem. copy_to_user checks that the target is writable by the current process.
There are also some architecture considerations. On x86, for example, the target pages must be pinned in memory. On some architectures, you might need special instructions. And so on. The Linux kernel's goal of being very portable requires this kind of abstraction.
This answer may be late, but anyway: copy_to_user() and its sister copy_from_user() both check the size parameter passed by the user against the size of the kernel buffer, so a read method of:
char name[] = "This message is from kernel space";
ssize_t read(struct file *f, char __user *to, size_t size, loff_t *loff){
int ret = copy_to_user(to, name, size);
if(ret){
pr_info("[+] Error while copying data to user space");
return -EFAULT; /* copy_to_user returns the number of bytes NOT copied, not an errno */
}
pr_info("[+] Finished copying data to user space");
return 0;
}
and a user-space call such as read(ret, buffer, 10); is fine, but replace 10 with 35 or more and the kernel will emit this error:
Buffer overflow detected (34 < 35)!
and cause the copy to fail, so that kernel memory beyond the buffer is not leaked. The same goes for copy_from_user(), which also makes some kernel buffer size checks.
That's why you have to use char name[] and not char *name: with a pointer (rather than an array) the size cannot be determined, which makes the kernel emit this error:
BUG: unable to handle page fault for address: ffffffffc106f280
#PF: supervisor write access in kernel mode
#PF: error_code(0x0003) - permissions violation
Hope this answer is helpful somehow.
I am new to CUDA and I want to use cudaHostAlloc. I was able to isolate my problem to this following code. Using malloc for host allocation works, using cudaHostAlloc results in a segfault, possibly because the area allocated is invalid? When I dump the pointer in both cases it is not null, so cudaHostAlloc returns something...
works
in_h = (int*) malloc(length*sizeof(int)); //works
for (int i = 0;i<length;i++)
in_h[i]=2;
doesn't work
cudaHostAlloc((void**)&in_h,length*sizeof(int),cudaHostAllocDefault);
for (int i = 0;i<length;i++)
in_h[i]=2; //segfaults
Standalone Code
#include <stdio.h>
void checkDevice()
{
cudaDeviceProp info;
int deviceName;
cudaGetDevice(&deviceName);
cudaGetDeviceProperties(&info,deviceName);
if (!info.deviceOverlap)
{
printf("Compute device can't use streams and should be discarded.");
exit(EXIT_FAILURE);
}
}
int main()
{
checkDevice();
int *in_h;
const int length = 10000;
cudaHostAlloc((void**)&in_h,length*sizeof(int),cudaHostAllocDefault);
printf("segfault comming %d\n",in_h);
for (int i = 0;i<length;i++)
{
in_h[i]=2; // Segfaults here
}
return EXIT_SUCCESS;
}
Invocation
[id129]$ nvcc fun.cu
[id129]$ ./a.out
segfault comming 327641824
Segmentation fault (core dumped)
Details
Program is run in interactive mode on a cluster. I was told that an invocation of the program from the compute node pushes it to the cluster. Have not had any trouble with other home made toy cuda codes.
Edit
cudaError_t err = cudaHostAlloc((void**)&in_h,length*sizeof(int),cudaHostAllocDefault);
printf("Error status is %s\n",cudaGetErrorString(err));
gives driver error...
Error status is CUDA driver version is insufficient for CUDA runtime version
Always check for errors. It is likely that cudaHostAlloc is failing to allocate any memory. When it fails, you do not bail out but instead write to unallocated address space. malloc usually succeeds in allocating the requested memory, but it can fail as well, so it is best to check the pointer before writing to it.
In future, it may be best to do something like this:
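int *ptr = NULL;
// Allocate with cudaHostAlloc and keep the returned status
cudaError_t err = cudaHostAlloc((void**)&ptr, length*sizeof(int), cudaHostAllocDefault);
// Bail out if the call failed or no memory was returned
if (err != cudaSuccess || !ptr) {
fprintf(stderr, "cudaHostAlloc failed: %s\n", cudaGetErrorString(err));
exit(EXIT_FAILURE);
}
// Only now is it safe to write to this memory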
EDIT (Response to edit in the question)
The error message indicates that your driver is older than the toolkit requires. If you do not want to be stuck for a while, try downloading an older version of the CUDA toolkit that is compatible with your driver. You can install it in your user account and use its nvcc and libraries temporarily.
Your segfault is not caused by the writes to the block of memory allocated by cudaHostAlloc, but rather from trying to 'free' an address returned from cudaHostAlloc. I was able to reproduce your problem using the code you provided, but replacing free with cudaFreeHost fixed the segfault for me.
cudaFreeHost
As stated in http://www.kernel.org/doc/htmldocs/kernel-hacking.html#routines-copy, these functions "can" sleep.
So, do I always have to take a lock (e.g. a mutex) when using these functions, or are there exceptions?
I'm currently working on a module and saw some kernel oopses on my system, but cannot reproduce them. I have a feeling they are triggered because I'm currently doing no locking around copy_[to/from]_user(). Maybe I'm wrong, but it smells like it has something to do with it.
I have something like:
static unsigned char user_buffer[BUFFER_SIZE];
static ssize_t mcom_write (struct file *file, const char *buf, size_t length, loff_t *offset) {
ssize_t retval;
size_t writeCount = (length < BUFFER_SIZE) ? length : BUFFER_SIZE;
memset((void*)&user_buffer, 0x00, sizeof user_buffer);
if (copy_from_user((void*)&user_buffer, buf, writeCount)) {
retval = -EFAULT;
return retval;
}
*offset += writeCount;
retval = writeCount;
cleanupNewline(user_buffer);
dispatch(user_buffer);
return retval;
}
Is this safe, or do I need to lock against other accesses while copy_from_user is running?
It's a char device I read and write from, and if a special packet in the network is received, there can be concurrent access to this buffer.
You need to do locking iff the kernel side data structure that you are copying to or from might go away otherwise - but it is that data structure you should be taking a lock on.
I am guessing your function mcom_write is a procfs write function (or similar) right? In that case, you most likely are writing to the procfs file, your program being blocked until mcom_write returns, so even if copy_[to/from]_user sleeps, your program wouldn't change the buffer.
You haven't stated how your program works so it is hard to say anything. If your program is multithreaded and one thread writes while another can change its data, then yes, you need locking, but between the threads of the user-space program not your kernel module.
If you have one thread writing, then your write to the procfs file is blocked until mcom_write finishes, so no locking is needed and your problem is somewhere else (unless something else is wrong with this function, but it is not copy_from_user).