C crypt function, malloc and valgrind - c

My man page for the crypt function states that:
"The return value points to static data whose content is overwritten by each call."
However, when using the SHA512 version (i.e., the salt starts with $6$...), valgrind does not seem to agree. Unless I free the pointer that crypt returns, it gets upset:
120 bytes in 1 blocks are still reachable in loss record 1 of 1
at 0x4C2BBA0: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x4C2DF4F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x521F4D4: __sha512_crypt (sha512-crypt.c:437)
Conversely, valgrind is fine if I use the DES version (so salt does not start with $6$ or similar).
What's going on here and is this behaviour explained anywhere?
Thanks in advance.
EDIT: Platform is Ubuntu 15.04 64-bit. Here's a program:
#define _XOPEN_SOURCE 700
#include <unistd.h>
int main(int argc, char** argv) {
char *hash = crypt("password", "$6$Salty");
return 0;
}

For some crypt variations, the preallocated buffer is not big enough, so it allocates (via malloc) a buffer that will be reused by the next call to crypt that needs a large buffer (possibly after reallocing it). That's why it is noted as "still reachable" by valgrind -- there's a static variable in the library that points at the block.
If you were to free it, it's likely the next call to crypt would misbehave (likely giving a runtime error about reusing a freed block).
No matter how many times you call crypt there will be one block identified by valgrind like this. It isn't a real memory leak, just constant overhead from the library that is pretty much impossible to avoid.
Generally you want to ignore valgrind messages about "still reachable" blocks unless the amount of memory is unexpectedly large, or the requests are coming from a place that should not be storing the returned pointers in global variables.
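The pattern described above can be sketched with a toy routine. This is not glibc's crypt code: make_label and its "label:" prefix are hypothetical names invented for illustration. It only shows how a library function can keep one malloc'd buffer in a static variable, grow it with realloc when needed, and return it on every call.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical library routine (NOT glibc's code): returns a pointer to a
 * buffer that the library owns and reuses on every call. */
const char *make_label(const char *name)
{
    static char *buf = NULL;   /* this pointer keeps the block "still reachable" */
    static size_t cap = 0;

    size_t need = strlen(name) + sizeof "label:";   /* sizeof counts the '\0' */
    if (need > cap) {
        char *grown = realloc(buf, need);
        if (grown == NULL)
            return NULL;       /* the old buffer, if any, stays valid */
        buf = grown;
        cap = need;
    }
    snprintf(buf, cap, "label:%s", name);
    return buf;                /* the caller must not free this */
}
```

Calling make_label twice overwrites the first result, which is the same contract the crypt man page describes. Freeing the returned pointer would leave the static variable dangling, so the next call would pass a freed block to realloc.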

Related

By how many bytes is it possible to exceed an allocated block without getting segmentation fault?

I learned that whenever malloc is called, the actual memory given to the program by the OS is not exactly the requested size, but is rounded up to page size. From what I know, page size is 1024 or 4096.
Based on this logic (and correct me if it's wrong), writing beyond my allocated block won't always cause segmentation fault, as this fault is given by the kernel (which has given me a full page and doesn't care what I do so long as I stay inside of it).
The weird thing is that in the program below I requested 8 bytes from malloc, and then wrote 80000 (sizeof(size_t) * 10000) bytes further, without getting a segmentation fault. I did get invalid read and write errors from valgrind, though.
Can someone shed light on the topic?
#include <stdio.h>
#include <stdlib.h>
int main()
{
size_t *ptr = (size_t *)malloc(sizeof(size_t));
size_t forward = 10000;
ptr += forward;
*ptr = 8;
printf("%zu\n", *ptr);
ptr -= forward;
free(ptr);
return (0);
}
You are confusing two very different things. If you ask malloc for 1 byte, it may allocate a 4,096 byte page from the operating system but it may only reserve 16 bytes of that block for this call to malloc. The next call to malloc may get another 16 bytes from that same 4,096 byte page.
You can't assume that just because you didn't get a segmentation fault that means you only accessed space that the implementation has reserved for your block. You can be writing on top of other objects being used for other purposes.
You're assuming that malloc requested just one page from the OS for your allocation, and that the subsequent pages remained unmapped. That's not necessarily true. malloc will typically request many pages at a time and then use them for later allocations if possible, thus making fewer system calls and reducing that overhead.
So it's likely that the several pages following your allocation are also mapped for your process, which is why you don't get a segfault. Instead, you overwrote memory that could be in use for something important.
Indeed, it's possible that malloc requested a larger chunk before main started, for internal C library structures, and that your allocation is just being placed within that chunk. In my test, using strace to see the brk() system call, this chunk was 33 pages in size (132 kB), and your block is in the first page of that chunk. So your 80 kB overrun is still within that mapped region.
So the answer to the title question is "it depends on where your block is located within its page, and which surrounding pages happen to be mapped". These in turn depend on the precise algorithm used by malloc and the pattern of other allocations and frees done by your code, or library code, up to that point, none of which you can really predict. It is possible in principle to find out which pages are mapped (e.g. on Linux by parsing /proc/self/maps) but this can change unpredictably as memory is allocated and freed by your code or library functions. So the practical answer is "you don't know".
Basically, you shouldn't make any assumptions about what will or won't happen when you write outside the boundaries of malloced memory. The only time when you can really be assured of getting a segfault is when you mmaped and mprotected the memory yourself.
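As a rough sketch of that last point, the function below maps two pages, makes the second one inaccessible with mprotect, and then overruns the first page in a child process so the parent can observe the outcome. The name overrun_is_caught is made up for this example, and the code assumes Linux, where the fault is delivered as SIGSEGV.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if writing one byte past a page we protected ourselves kills
 * the writer with SIGSEGV, 0 if it does not, -1 on setup failure. */
int overrun_is_caught(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *region = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;

    /* Turn the second page into a guard page: any access faults. */
    if (mprotect(region + page, page, PROT_NONE) != 0) {
        munmap(region, 2 * page);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(region, 2 * page);
        return -1;
    }
    if (pid == 0) {
        volatile char *p = region;
        p[page - 1] = 'a';     /* last valid byte: allowed */
        p[page] = 'b';         /* first guard byte: expected to fault */
        _exit(0);              /* only reached if no fault happened */
    }

    int status = 0;
    waitpid(pid, &status, 0);
    munmap(region, 2 * page);
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV;
}
```

This is essentially what tools like Electric Fence do for every allocation: place the block right before a protected page so the first out-of-bounds byte faults.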

Does c keep track of allocated heap memory?

I know people keep telling me to read the documentation, and I haven't got the hang of that yet. I don't specifically know what to look for inside the docs. If you have a tip on how to get used to reading the docs, give me your tips as a bonus, but for now here is my question. Say in a simple program we allocate some chunk of memory; I can free it once I am done, right? Here is one such program that does nothing but allocate and deallocate memory on the heap.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
char *s = malloc(10);
free(s);
return (0);
}
After we compile this, if we run valgrind we can see that everything is freed. Now, here is just a little bit of a twist to the previous program.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
char *s = malloc(10);
s++;
free(s);
return (0);
}
Here, before I free it, I increment the address by one, and all hell breaks loose. free doesn't seem to know that this chunk of memory was allocated (even though it is a subset of allocated memory). In case you want to see it, here is my error message.
*** Error in `./a.out': free(): invalid pointer: 0x0000000001e00011 ***
Aborted (core dumped).
So this got me thinking.
Does c keep track of memory's allocated on the heap
If not, how does it know what to free and what to not?
And if it has such a knowledge, why doesn't c automatically deallocate such memories just before exiting. why memory leaks?
The C standard description of free says the following:
The free function causes the space pointed to by ptr to be deallocated, that is, made available for further allocation. If ptr is a null pointer, no action occurs. Otherwise, if the argument does not match a pointer earlier returned by a memory management function, or if the space has been deallocated by a call to free or realloc, the behavior is undefined.
Since you changed the pointer, it does not match a pointer returned by a memory management function (i.e. m/re/calloc), the behaviour is undefined, and anything can happen. Including the runtime library noticing that you've tried to free an invalid pointer, but the runtime is not required to do that either.
As for
Does C keep track of memory's allocated on the heap
It might... but it does not necessarily have to...
If not, how does it know what to free and what to not?
Well, if it does free the memory pointed to by a pointer, then it obviously needs to have some kind of bookkeeping about the sizes of the allocations... but it does not need to be able to figure out if any pointers are still pointing to that memory area.
And if it has such a knowledge, why doesn't c automatically deallocate such memories just before exiting. why memory leaks?
Usually memory is freed after the process exits by the operating system. That's not the problem. The real problem is the leaks that happen while the program is still running.
My knowledge is limited, but I think I can answer some of your questions.
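One way to stay within the rule quoted from the standard is to keep the pointer returned by malloc untouched and walk the block with a second pointer. A minimal sketch, where sum_digits is a made-up helper used only to have something to iterate over:

```c
#include <stdlib.h>
#include <string.h>

/* Copies text into a heap buffer, walks it with a cursor, and frees the
 * buffer through the original pointer. Returns -1 if malloc fails. */
int sum_digits(const char *text)
{
    size_t len = strlen(text);
    char *base = malloc(len + 1);      /* base is never modified */
    if (base == NULL)
        return -1;
    memcpy(base, text, len + 1);

    int sum = 0;
    for (char *cur = base; *cur != '\0'; cur++) {   /* only cur moves */
        if (*cur >= '0' && *cur <= '9')
            sum += *cur - '0';
    }

    free(base);                        /* matches the pointer malloc returned */
    return sum;
}
```

Because base still holds the exact value malloc returned, the call to free is well defined no matter how far cur advanced.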
Does c keep track of memory's allocated on the heap
The C language itself does not track anything. The allocator in the C library keeps its own bookkeeping, and the OS tracks which memory regions are mapped into the process and which are not.
If not, how does it know what to free and what to not?
See how-does-free-know-how-much-to-free
Put simply: when calling malloc, you give it a size. A typical implementation such as glibc uses a few extra bytes (8 on a 64-bit system) right in front of the pointer it returns to "remember" this size information. When you free that pointer, free takes the address, reads the bytes just before it to get the size, and returns the block to the allocator's pool of free memory (which may or may not hand it back to the operating system).
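The "size stored in front of the pointer" idea can be illustrated with a toy wrapper around malloc. This is not how glibc lays out its chunks; sized_malloc, sized_size and sized_free are hypothetical names, and the header is padded to max_align_t (C11) so the returned pointer stays suitably aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy header placed directly before the user's block. */
typedef union header {
    size_t size;
    max_align_t align;   /* forces the header size to keep alignment */
} header;

void *sized_malloc(size_t n)
{
    if (n > SIZE_MAX - sizeof(header))
        return NULL;                   /* size computation would overflow */
    header *h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->size = n;
    return h + 1;                      /* user memory starts after the header */
}

/* Step back one header from the user pointer to recover the size. */
size_t sized_size(const void *p)
{
    return ((const header *)p - 1)->size;
}

void sized_free(void *p)
{
    if (p != NULL)
        free((header *)p - 1);         /* free the address malloc returned */
}
```

With this layout, passing p + 1 to sized_free would read a bogus header and hand free an address malloc never returned, which mirrors why the incremented pointer in the question breaks.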
And if it has such a knowledge, why doesn't c automatically deallocate such memories just before exiting. why memory leaks?
The OS knows this information. Thus when a C program exits, the OS takes charge and reclaims the memory you didn't free explicitly.
*** Error in `./a.out': free(): invalid pointer: 0x0000000001e00011 ***
Aborted (core dumped).
As for this error: it comes from a sanity check in glibc's free function.
glibc malloc returns blocks aligned to a fixed boundary, e.g. 16 bytes on x86-64. So after s++, the pointer is no longer aligned, and free notices.
Now you may think: what if I do s += 32; and plant fake size information before s? I tried this. Sadly, I can't trick glibc's free function that naively; it has other checks to prevent this. I stopped digging at this point.

C read reallocates buffer

I'm reading standard input on Linux. I provide read with a buffer of insufficient length (only two characters), so the buffer should overflow and a segmentation fault should occur. However, the program runs OK. Why?
Compiled with:
gcc file.c -ansi
Run with:
echo abcd | ./a.out
Program:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define STDIN 0
int main() {
/* This buffer is intentionally too small for input */
char * smallBuffer = (char *) malloc( sizeof(char) * 2 );
int readedBytes;
readedBytes = read(STDIN, smallBuffer, sizeof(char) * 4);
printf("Readed: %i, String:'%s'\n", readedBytes, smallBuffer);
return 0;
}
Output:
Readed: 4, String:'abcd'
It is generally wrong to expect a segmentation fault in cases like this. Buffer overflows result in undefined behavior, which means the behavior of such code is unpredictable. It may or may not result in a segmentation fault.
Technically, when you allocate a buffer of two bytes, for example, there are two possible scenarios.
The first is when the buffer is allocated on the stack. The stack itself is larger than 2 bytes, and if you overflow that buffer, the memory protection unit will still allow you to write to the memory "outside" that buffer. In this case you won't get a segmentation fault, but you could mess up other variables stored "nearby" on the stack. This kind of situation is generally referred to as "stack smashing".
The second possible scenario is allocating memory dynamically (i.e. using malloc()). In that case it is very likely that the block actually allocated is larger than requested, or is placed on the same page as memory allocated or reserved earlier. The program would then write past the two-byte buffer. It may or may not receive a segmentation violation signal, but either way the behavior is undefined.
Such bugs can be hard to find without special care. There are tools that help to trace such issues; Valgrind is one of them.
On a side note, you may only expect a segmentation fault if you know for sure that the virtual address you are using is invalid or is protected against read, write, or execution by the memory protection unit (which might not exist at all on the hardware on which you are running your application).
Hope it helps. Good Luck!
malloc guarantees to provide you with at least the amount of memory you request. To see an error you can use a program such as valgrind and you'll see the following:
==22265== Syscall param read(buf) points to unaddressable byte(s)
==22265== at 0x4F188B0: __read_nocancel (syscall-template.S:82)
==22265== by 0x4005B4: main (in /home/def/p/cm/Git/git/a.out)
==22265== Address 0x51f1042 is 0 bytes after a block of size 2 alloc'd
==22265== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==22265== by 0x400595: main (in /home/def/p/cm/Git/git/a.out)
In this case, the program overwrites some of its own memory. The OS does not notice this.
A segmentation fault occurs when a process tries to access memory that does not belong to it. However, an operating system assigns memory not on a per-byte basis but in larger blocks called pages (a size of 4 KB is common). So when you allocate two bytes, the heap manager places them on some memory page (either a previously allocated one or a new one), and the whole page is marked as belonging to your process. It is highly probable that these two bytes will not sit at the very end of the page, so your program will be able to write past them without any OS exception at the time of writing (but most probably it will come back to bite you later).
Too small a buffer is not a guarantee that the program will crash. It depends on what data exists in the bytes following the buffer, how the compiler arranges the executable, and how the operating system organizes memory.
Chances are that the bytes following your buffer already "belong" to your program and are padding or otherwise store nothing of import.
The 3rd parameter is not the size of the buffer, but the number of bytes to read. So you call the function and say "here's a stream, read 4 bytes from it and put them in this buffer". But read doesn't know the buffer size. So it reads as many bytes as it can, up to the count you gave, and puts them in your buffer, assuming you supplied a buffer large enough. What you get then is memory corruption. Your program may work OK in this simple case, but usually it just fails unexpectedly in some other place.
I think you should pay particular attention to what malloc() really does. Under Linux, a malloc() call is not only unlikely to fail, it also does not grant you a real reservation of space even when it returns a non-null pointer.
This behaviour is typically called an "optimistic memory allocation strategy" or "overcommit". It is closely tied to the kernel, and programming in C under Linux is not that easy. In my opinion you should switch to C++: you will find a familiar syntax to start with, it makes more sense to use C++ than C for productivity these days, and with a simple RAII approach C++ is safer than C.
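Since read only knows the count it is given, the caller has to derive that count from the real buffer size. A small sketch of one way to do it; read_text is a hypothetical wrapper name, and it reserves one byte for the terminator so the result can be printed with %s:

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads at most bufsize - 1 bytes from fd into buf and NUL-terminates it.
 * Returns the number of bytes read, or -1 on error or if bufsize is 0. */
ssize_t read_text(int fd, char *buf, size_t bufsize)
{
    if (bufsize == 0)
        return -1;
    ssize_t n = read(fd, buf, bufsize - 1);   /* never more than fits */
    if (n < 0) {
        buf[0] = '\0';
        return -1;
    }
    buf[n] = '\0';                            /* n <= bufsize - 1 */
    return n;
}
```

With a 3-byte buffer and the input "abcd", this stores "ab" and leaves the rest of the input unread instead of writing past the buffer.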

In memcpy how to handle memory overflow?

int main ()
{
char *destination;
char source[10] = "jigarpatel";
destination = (char*) malloc(5);
memcpy(destination, source, 10);
printf("%s and size is %d", destination, strlen(destination));
free(destination);
return 0;
}
Output:
jigarpatel and size is 10
Question:
Here I have allocated just 5 bytes to destination, but the destination length is 10, why is this?
Where are the other bytes stored?
Is it safe in embedded system? Any chances of a crash or a segmentation fault?
How can I detect this type of mistake?
Another Question :
I am writing a library where the user asks how much memory is needed, the library says to allocate 10 bytes, and the user then mallocs 10 bytes and passes the pointer to the library, which stores some data there. If the library asked for 10 bytes but the user allocated only 5 and passed that pointer, how can I detect that the user hasn't allocated sufficient memory?
Here I have allocated only 5 bytes to destination, so why is the destination length still 10?
printf("%s and size is %d",destination,strlen(destination));
strlen() treats \0 as the end of the string, so it keeps counting until it encounters a \0. That does not mean destination has that much memory allocated.
You are writing beyond the bounds of the allocated memory. That is undefined behavior: anything can happen and the result cannot be relied upon. You are simply lucky that your program does not crash.
where other bytes are stored?
The other bytes overwrite some other memory allocation beyond the 5 bytes allocated to destination.
Is it safe in embedded system? Any chances of a crash or a segmentation fault?
It is NOT safe. It causes undefined behavior, and you are merely lucky if it appears to work.
How can I detect this type of mistake?
Most platforms have memory-checking tools, such as Valgrind on Linux/Unix. You can use them and they will point out such memory overwrites.
This sometimes happens to work but it's definitely not safe on any system. You are writing past the end of what malloc is giving you. From the viewpoint of C it's illegal, but from the viewpoint of the OS it might be ok (that memory might be paged with adequate permissions).
Another problem is that if you later call malloc again, it might give you some memory including those 5 bytes that you are using without asking. This should provide some interesting debugging sessions.
Here I have allocated just 5 bytes to destination, but the destination length is 10, why is this?
The destination has only 5 bytes allocated, but because of the way malloc works on your system, the write doesn't cause a fault, since it happens to land inside a valid page.
Where are the other bytes stored?
Right after the first 5, for now.
Is it safe in embedded system? Any chances of a crash or a segmentation fault?
It's unsafe in any system. Many chances of crashes.
How can I detect this type of mistake?
Using valgrind or any memory debugger. A gentle introduction to valgrind can be found here.
¹ For example, on Linux (Glibc), small (~64 bytes) malloc requests are served from a small list of preallocated pages called "fastbins". Each fastbin has a fixed size, and hence using an allocated fastbin up to that size would not trigger a segmentation violation. More details on how this happens can be found here, for a more rigorous treatment of the topic you might refer to the malloc source code.
No it's not safe - writing beyond the end of an allocated block will typically cause heap corruption.
Use tools such as valgrind to catch this and other kinds of error.
You are lucky that it has not crashed. The third parameter of memcpy should be 4, and then you should put the null character in the fifth position to terminate the string.
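The fix suggested above generalises to any destination size if the size is passed along with the pointer, which is also the usual answer to the library question: an API cannot discover how much the caller allocated, so it has to be told. A minimal sketch, with copy_truncated as a made-up name:

```c
#include <stddef.h>
#include <string.h>

/* Copies as much of src as fits into dst (dst_size bytes in total),
 * always NUL-terminating. Returns the number of characters copied. */
size_t copy_truncated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return 0;                 /* no room even for the terminator */
    size_t n = strlen(src);
    if (n > dst_size - 1)
        n = dst_size - 1;         /* leave one byte for '\0' */
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

For the program in the question, copy_truncated(destination, 5, "jigarpatel") would store "jiga" plus the terminator and stay inside the 5 allocated bytes. Note that the source must itself be NUL-terminated for strlen to be valid, which char source[10] = "jigarpatel" is not.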

Malloc -> how much memory has been allocated?

# include <stdio.h>
# include <stdbool.h>
# include <string.h>
# include <stdlib.h>
int main ()
{
char * buffer;
buffer = malloc (2);
if (buffer == NULL){
printf("big errors");
}
strcpy(buffer, "hello");
printf("buffer is %s\n", buffer);
free(buffer);
return 0;
}
I allocated 2 bytes of memory to the pointer/char buffer yet if I assign the C-style string hello to it, it still prints the entire string, without giving me any errors. Why doesn't the compiler give me an error telling me there isn't enough memory allocated? I read a couple of questions that ask how to check how much memory malloc actually allocates but I didn't find a concrete answer. Shouldn't the free function have to know exactly how much memory is allocated to buffer?
The compiler doesn't know. This is the joy and terror of C. malloc belongs to the runtime. All the compiler knows is that you have told it that malloc returns a void*; it has no idea how much memory that points to, or how much strcpy is going to copy.
Tools like valgrind detect some of these errors. Other programming languages make it harder to shoot yourself in the foot. Not C.
No production malloc() implementation should prevent you from trying to write past what you allocated. It is assumed that if you allocate 123 bytes, you will use all or less than what you allocated. malloc(), for efficiency's sake, has to assume that a programmer is going to keep track of their pointers.
Using memory that you didn't explicitly and successfully ask malloc() to give you is undefined behavior. You might have asked for n bytes but got n + x, due to the malloc() implementation optimizing for byte alignment. Or you could be writing to a black hole. You never can know, that's why it's undefined behavior.
That being said ...
There are malloc() implementations that give you built in statistics and debugging, however these need to be used in lieu of the standard malloc() facility just like you would if you were using a garbage collected variety.
I've also seen variants designed strictly for LD_PRELOAD that expose a function to allow you to define a callback with at least one void pointer as an argument. That argument expects a structure that contains the statistical data. Other tools like electric fence will simply halt your program on the exact instruction that resulted in an overrun or access to invalid blocks. As @R.. points out in comments, that is great for debugging but horribly inefficient.
In all honesty or (as they say) 'at the end of the day' - it's much easier to use a heap profiler such as Valgrind and its associated tools (massif) in this case which will give you quite a bit of information. In this particular case, Valgrind would have pointed out the obvious - you wrote past the allocated boundary. In most cases, however when this is not intentional, a good profiler / error detector is priceless.
Using a profiler isn't always possible due to:
Timing issues while running under a profiler (but those are common any time calls to malloc() are intercepted).
Profiler is not available for your platform / arch
The debug data (from a logging malloc()) must be an integral part of the program
We used a variant of the library that I linked in HelenOS (I'm not sure if they're still using it) for quite a while, as debugging at the VMM was known to cause insanity.
Still, think hard about future ramifications when considering a drop in replacement, when it comes to the malloc() facility you almost always want to use what the system ships.
How much malloc internally allocates is implementation-dependent and OS-dependent (e.g. multiples of 8 bytes or more). Writing into the unallocated bytes may overwrite other variables' values even if your compiler and runtime don't detect the error. The free function gets the number of bytes allocated from the allocator's own bookkeeping, which may be kept separate from the allocated region, for example in a free list.
Why doesn't the compiler give me an error telling me there isn't enough memory allocated?
C does not stop you from using memory you should not. You can use that memory, but it is bad and results in undefined behaviour. You are writing in a place you should not. The program might appear to run correctly, but might crash later. This is UB: you do not know what might happen.
This is what is happening with your strcpy(). You write to a place you do not own, but the language does not protect you from that. So you should make sure you always know what and where you are writing, or make sure you stop when you are about to exceed valid memory bounds.
I read a couple of questions that ask how to check how much memory malloc actually allocates but I didn't find a concrete answer. Shouldn't the 'free' function have to know how much memory is exactly allocated to 'buffer'?
malloc() might allocate more memory than you request because of alignment padding.
More: http://en.wikipedia.org/wiki/Data_structure_alignment
free() releases exactly the block you allocated with malloc(), but it is not as smart as you think. E.g.:
#include <stdlib.h>
int main()
{
char * ptr = malloc(10);
if(ptr)
{
++ptr; // Now on ptr+1
free(ptr); // Undefined Behaviour
}
}
You should always free() a pointer that points to the start of the block, i.e. the exact value malloc() returned. Doing a free(NULL) is safe.
You've written past the end of the buffer you allocated. The result is undefined behavior. Some run time libraries with the right options have at least some ability to diagnose problems like this, but not all do, and even those that do only do so at run-time, and usually only when compiled with the correct options.
Malloc -> how much memory has been allocated?
On the first successful call to malloc, glibc typically asks the kernel for a much larger region than you requested (on the order of 128 KB by default) and carves your block out of it.
What you requested is buffer = malloc(2);. Only 2 bytes (plus bookkeeping and alignment padding) are reserved for you, but the surrounding region is mapped into your process.
strcpy(buffer, "hello"); therefore writes into mapped memory and does not fault, even though it overruns your block.
This program makes it clear.
#include <stdio.h>
#include <stdlib.h>
int main()
{
int *p = (int *) malloc(2); /* request is only 2 bytes */
p[0]=100; /* every store below overruns the block: undefined behaviour */
p[1]=200;
p[2]=300;
p[3]=400;
p[4]=500;
int i=0;
for(;i<5;i++,p++)
printf("%d\t",*p);
}
On the first call, malloc obtains roughly 128 KB from the kernel and serves your 2-byte request from it, so the string "hello" lands inside that mapped region. A second call to malloc is served from the same region.
Requests of 128 KB or more are served through the mmap interface instead (the default threshold in glibc). You can refer to the malloc man page.
There is no compiler- or platform-independent way of finding out how much memory malloc actually allocated. malloc will in general allocate slightly more than you ask for; see here:
http://41j.com/blog/2011/09/finding-out-how-much-memory-was-allocated/
On Linux you can use malloc_usable_size to find out how much memory you can use. On MacOS and other BSD platforms you can use malloc_size. The post linked above has complete examples of both these techniques.
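A minimal sketch of the Linux variant, assuming glibc, where malloc_usable_size is declared in <malloc.h>. The helper name usable_size_for is made up for this example.

```c
#include <malloc.h>   /* malloc_usable_size (glibc-specific header) */
#include <stddef.h>
#include <stdlib.h>

/* Allocates `request` bytes, asks the allocator how many bytes of that
 * block are actually usable, frees the block, and returns the answer.
 * Returns 0 if the allocation fails. */
size_t usable_size_for(size_t request)
{
    void *p = malloc(request);
    if (p == NULL)
        return 0;
    size_t usable = malloc_usable_size(p);   /* always >= request */
    free(p);
    return usable;
}
```

The exact value depends on the allocator version and platform, so it is best treated as a debugging aid. Code that relies on the extra bytes being there is not portable.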

Resources