Odd behavior regarding malloc() - c

Why does this work?
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
char * abc = malloc(1) + 4; //WRONG use of malloc.
char * xyz = "abc";
strcpy(abc, xyz); //Should fail.
printf("%s\n", abc); //Prints abc
}
I would expect the strcpy to fail for not having enough memory, as I'm passing 1 as the argument to malloc(). Instead, this compiles and runs flawlessly (with both GCC on Linux and Dev-C++ on Windows).
Is this expected behavior, or a happy coincidence?
I assume this isn't good practice, but why does it work?
Without the +4 at the end of malloc(), I get a segmentation fault. This is mostly what I'm curious about.

This is undefined behavior. Don't do that!
You're trying to access a memory location beyond the allocated region. That location is invalid, and accessing invalid memory invokes UB.
FWIW,
there is nothing in the C standard that stops you from accessing out-of-bounds (invalid) memory, and
neither does strcpy() check the size of the destination buffer against the length of the source,
so this code compiles and (somehow) runs. But as soon as it hits UB, nothing is guaranteed anymore.
P.S. The only guaranteed thing here is undefined behavior.

This is basically another demonstration of the fact that pointers in C are low-level and (typically) not checked. You said you expected it to "fail for not having enough memory", but think about it: what did you expect to fail? The strcpy function most assuredly does not check that there's enough room for the string it's copying. It has no way to do so; all it gets is a pointer. It just starts copying characters, and in practice it either succeeds or dies on a segmentation violation. (But the point is it does not die on "out of memory".)

Do not rely on that behavior. The answerers responding "vigorously" are justified in that relying on such behavior can lurk undetected for years and then, one day, a minor adjustment to the runtime system suddenly causes catastrophic failure.
It seems to work because, since the advent of 32-bit computers, many (if not most) C runtime libraries implement malloc()/free() with a heap granularity of 16 bytes. That is, calling malloc() with any argument from 1 to 16 yields the same allocation. So you get a little more memory than you asked for, and that slack lets the code execute.
A tool like valgrind would certainly detect a problem.

Related

C strange int array pointer

int *array; //it allocates a pointer to an int, right?
array = malloc(sizeof(int)); //allocates space for ONE int, right?
scanf("%d", &array[4]); //must generate a segmentation fault because there isn't enough allocated space, right?
printf("%d",array[4]); //it shouldn't print anything...
but it prints 4! Why?
Reading or writing off the end of an array in C results in undefined behavior, meaning that absolutely anything can happen. It can crash the program, or format the hard drive, or set the computer on fire. In your case, it just so happens to be the case that this works out perfectly fine, probably because malloc pads the length of the allocated memory for efficiency reasons (or perhaps because you're trashing memory that malloc wants to use later on). However, this isn't at all portable, isn't guaranteed to work, and is a disaster waiting to happen.
Hope this helps!
Because the operating system happens to have mapped the memory you touched. This code is by no means guaranteed to run on another machine, or even on another run.
C doesn't check whether your code goes out of the bounds of an array; that is up to the programmer. I guess you can call it undefined behavior (although it may not feel like what people mean by undefined behavior, because the memory will mostly be in some part of the stack or the heap, so you can still get to it).
When you say array[4] you actually say *(array + 4); since array is an int *, that address is 4 * sizeof(int) bytes past the start of the array. You land on some location in memory that does exist: it might be read-only, it might belong to another array or variable in your program, or it might just work perfectly. There is no guarantee you'll get an error, but it is still undefined behavior.
To understand more about undefined behavior you can go to this article (which I find very interesting).

free() replaces string with zero?

I could use a little help with free().
When I run the following:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, const char *argv[])
{
char *mystring = malloc(6 * sizeof(char));
strcpy(mystring, "Hello");
printf("%p\n", mystring);
printf("%s\n", mystring);
free(mystring);
printf("%p\n", mystring);
printf("%d\n", *mystring);
return 0;
}
I get:
0x8f46008
Hello
0x8f46008
0
Did free() replace the string 'Hello' in memory with zero?
Note: This is just for academic purposes. I would never reference freed memory for real.
Thanks,
Frank
Once you have freed a block of memory, reading that memory again results in undefined behavior and is a serious security and stability hazard. You cannot count on anything holding true for memory that has been freed, so there is no guarantee whether the memory will be zeroed or not. Unless you are absolutely sure of what you're doing, don't reference memory after you've freed it.
As an amusing anecdote about this, the original SimCity game had a bug in it where it referenced memory that had been freed. In DOS, this ended up not causing a crash, but when it was ported to Windows the game started crashing repeatedly. The Windows team had to specifically build in a case into the OS such that if SimCity was run, a custom memory manager would be used to prevent this sort of crash. You can read the full story here.
In short, once it's freed, don't touch it. Otherwise you risk bugs that some poor programmer years down the line will have to fix for you. Tools like valgrind exist to detect these sorts of errors specifically because they're so nasty.
Hope this helps!
The contents of mystring (*mystring, mystring[0], and friends) are undefined after you free the memory. You can not rely on it containing "Hello". You also cannot rely on it containing an ASCII NUL (as you see here).
You also cannot rely on reading it not causing a segmentation fault. Don't do it.
If you were to run this program in a memory checker like valgrind, you would see an error here about access to freed memory.
Maybe in debug mode, on your specific computer, compiled with your specific compiler. In general, though, you should expect that piece of code to crash (or worse).

Writing more characters than malloced. Why does it not fail?

Why does the following work and not throw some kind of segmentation fault?
char *path = "/usr/bin/";
char *random = "012";
// path + random + \0
// so it's malloc(13), but I get 16 bytes due to memory alignment (I'm on 32-bit)
newPath = (char *) malloc(strlen(path) + strlen(random) + 1);
strcat(newPath, path);
strcat(newPath, "random");
// newPath is now: "/usr/bin/012\0" which makes 13 characters.
However, if I add
strcat(newPath, "RANDOMBUNNIES");
shouldn't this call fail, because strcat uses more memory than allocated? Consequently, shouldn't
free(newPath)
also fail because it tries to free 16 bytes but I used 26 bytes ("/usr/bin/012RANDOMBUNNIES\0")?
Thank you so much in advance!
Most often this kind of overrun problem doesn't make your program explode in a cloud of smoke and the smell of burnt sulphur. It's more subtle: a variable allocated after the overrun one gets altered, causing unexplainable and seemingly random behavior of the program later on.
The whole program snippet is wrong. You are assuming that malloc() returns something whose first byte is set to 0. That is not generally the case, so even your "safe" strcat() is wrong.
But otherwise, as others have said, undefined behavior doesn't mean your program will crash. It only means it can do anything (including crashing, but also not crashing, if you are unlucky).
(Also, you shouldn't cast the return value of malloc().)
Writing more characters than you malloc'd is undefined behavior.
Undefined Behavior means anything can happen and the behavior cannot be explained.
A segmentation fault generally occurs because of accessing an invalid memory section. Here it doesn't raise one, because you can still access that memory; however, you are overwriting other memory locations, which is undefined behavior, even though your code happens to run fine.
It will fail or not fail seemingly at random, depending on what sits in memory just after the malloc'd block.
Also, when you want to concatenate random, you shouldn't put it in quotes. That should be:
strcat(newPath, random);
Many C library functions do not check whether they overrun. It's up to the programmer to manage the memory allocated. You may just be writing over another variable in memory, with unpredictable effects on the operation of your program. C is designed for efficiency, not for pointing out programming errors.
You got lucky with this call. You don't get a segfault because your writes presumably stay within an allocated part of the address space. This is undefined behaviour: the last characters written are not guaranteed to be preserved, and these calls may also fail outright.
Buffer overruns aren't guaranteed to cause a segfault. The behavior is simply undefined. You may get away with writing to memory that's not yours one time, cause a crash another time, and silently overwrite something completely unrelated a third time. Which one of these happens depends on the OS (and OS version), the hardware, the compiler (and compiler flags), and pretty much everything else that is running on your system.
This is what makes buffer overruns such nasty sources of bugs: Often, the apparent symptom shows in production, but not when run through a debugger; and the symptoms usually don't show in the part of the program where they originate. And of course, they are a welcome vulnerability to inject your own code.
Operating systems allocate at a certain granularity, which on my system is a page size of 4 KB (typical on 32-bit machines); whether malloc() takes a fresh page from the OS for every allocation depends on your C runtime library.

Malloc -> how much memory has been allocated?

#include <stdio.h>
#include <stdbool.h>
#include <string.h>
#include <stdlib.h>
int main ()
{
char * buffer;
buffer = malloc (2);
if (buffer == NULL){
printf("big errors");
}
strcpy(buffer, "hello");
printf("buffer is %s\n", buffer);
free(buffer);
return 0;
}
I allocated 2 bytes of memory for the char pointer buffer, yet when I copy the C-style string "hello" into it, it still prints the entire string without giving me any errors. Why doesn't the compiler give me an error telling me there isn't enough memory allocated? I read a couple of questions that ask how to check how much memory malloc actually allocates, but I didn't find a concrete answer. Shouldn't the free function have to know exactly how much memory is allocated to buffer?
The compiler doesn't know. This is the joy and terror of C. malloc() belongs to the runtime; all the compiler knows is that you have told it malloc() returns a void *. It has no idea how much memory that is, or how much strcpy() is going to copy.
Tools like valgrind detect some of these errors. Other programming languages make it harder to shoot yourself in the foot. Not C.
No production malloc() implementation will prevent you from writing past what you allocated. It is assumed that if you allocate 123 bytes, you will use no more than that. malloc(), for efficiency's sake, has to assume that the programmer keeps track of their pointers.
Using memory that you didn't explicitly and successfully ask malloc() to give you is undefined behavior. You might have asked for n bytes but got n + x, due to the malloc() implementation optimizing for byte alignment. Or you could be writing to a black hole. You never can know, that's why it's undefined behavior.
That being said ...
There are malloc() implementations that give you built in statistics and debugging, however these need to be used in lieu of the standard malloc() facility just like you would if you were using a garbage collected variety.
I've also seen variants designed strictly for LD_PRELOAD that expose a function allowing you to define a callback with at least one void pointer as an argument; that argument receives a structure containing the statistical data. Other tools, like Electric Fence, will simply halt your program on the exact instruction that overran or accessed an invalid block. As @R.. points out in the comments, that is great for debugging but horribly inefficient.
In all honesty or (as they say) 'at the end of the day' - it's much easier to use a heap profiler such as Valgrind and its associated tools (massif) in this case which will give you quite a bit of information. In this particular case, Valgrind would have pointed out the obvious - you wrote past the allocated boundary. In most cases, however when this is not intentional, a good profiler / error detector is priceless.
Using a profiler isn't always possible due to:
Timing issues while running under a profiler (but those are common any time calls to malloc() are intercepted).
Profiler is not available for your platform / arch
The debug data (from a logging malloc()) must be an integral part of the program
We used a variant of the library that I linked in HelenOS (I'm not sure if they're still using it) for quite a while, as debugging at the VMM was known to cause insanity.
Still, think hard about the future ramifications when considering a drop-in replacement; when it comes to the malloc() facility, you almost always want to use what the system ships.
How much malloc() internally allocates is implementation- and OS-dependent (e.g. multiples of 8 bytes or more). Writing into the unallocated bytes may overwrite other variables' values even if your compiler and runtime don't detect the error. free() learns the number of bytes allocated from bookkeeping stored separately from the allocated region, for example in a free list.
Why doesn't the compiler give me an error telling me there isn't enough memory allocated?
C does not stop you from using memory you should not. You can use that memory, but doing so is undefined behaviour: you are writing in a place you should not. The program might appear to run correctly, yet might crash later. That is UB; you do not know what might happen.
This is what is happening with your strcpy(). You write to a place you do not own, but the language does not protect you from that. So make sure you always know what and where you are writing, or stop before you exceed the valid memory bounds.
I read a couple of questions that ask how to check how much memory malloc actually allocates but I didn't find a concrete answer. Shouldn't the 'free' function have to know how much memory is exactly allocated to 'buffer'?
malloc() might allocate more memory than you request because of alignment padding.
More: http://en.wikipedia.org/wiki/Data_structure_alignment
free() releases exactly the block you allocated with malloc(), but it is not as smart as you might think. E.g.:
int main()
{
    char *ptr = malloc(10);
    if (ptr)
    {
        ++ptr;      // Now at ptr + 1
        free(ptr);  // Undefined behaviour: not the pointer malloc() returned
    }
}
You should always free() the pointer that malloc() originally returned, pointing at the first byte of the block. Calling free(NULL) (i.e. free(0)) is safe.
You've written past the end of the buffer you allocated. The result is undefined behavior. Some runtime libraries, given the right options, have at least some ability to diagnose problems like this, but not all do, and even those that do catch it only at run time, usually only when compiled with the correct options.
Malloc -> how much memory has been allocated?
When you allocate memory using malloc() and the call succeeds, the allocator typically grabs a much larger block from the OS up front; with glibc, the first call to malloc() extends the heap by roughly 128 KB and then serves requests from that pool.
You requested buffer = malloc(2). Though you asked for only 2 bytes, the surrounding 128 KB pool has been mapped, so strcpy(buffer, "hello") lands in mapped memory and the string appears to fit.
This program will make it clear:
int main()
{
    int *p = (int *) malloc(2);  // request is only 2 bytes
    p[0] = 100;
    p[1] = 200;  // p[1] through p[4] are out of bounds
    p[2] = 300;
    p[3] = 400;
    p[4] = 500;
    int i = 0;
    for (; i < 5; i++, p++)
        printf("%d\t", *p);
}
On the first call to malloc(), the allocator obtains that ~128 KB region and carves your 2-byte request out of it; the string "hello" happens to fit in the surrounding slack. A second call to malloc() is served from the same region. For requests beyond that threshold, glibc uses the mmap interface instead; see the malloc man page.
There is no compiler- or platform-independent way of finding out how much memory malloc() actually allocated. malloc() will in general allocate slightly more than you ask for; see here:
http://41j.com/blog/2011/09/finding-out-how-much-memory-was-allocated/
On Linux you can use malloc_usable_size() to find out how much memory you can actually use. On macOS and other BSD platforms you can use malloc_size(). The post linked above has complete examples of both techniques.

Writing to pointer out of bounds after malloc() not causing error

when I try the code below it works fine. Am I missing something?
int main()
{
int *p;
p=malloc(sizeof(int));
printf("size of p=%d\n",sizeof(p));
p[500]=999999;
printf("p[0]=%d",p[500]);
return 0;
}
I tried it with malloc(0*sizeof(int)) and other sizes, but it works just fine. The program only crashes when I don't use malloc at all. So even when I allocate 0 bytes for the array p, it still stores values properly. Why am I even bothering with malloc, then?
It might appear to work fine, but it isn't very safe at all. By writing data outside the allocated block of memory you are overwriting some data you shouldn't. This is one of the greatest causes of segfaults and other memory errors, and what you're observing with it appearing to work in this short program is what makes it so difficult to hunt down the root cause.
Read this article, in particular the part on memory corruption, to begin understanding the problem.
Valgrind is an excellent tool for analysing memory errors such as the one you provide.
@David made a good comment. Compare the results of running your code with running the following code; note that the latter results in a runtime error (with pretty much no useful output!) on ideone.com, whereas the former succeeds as you experienced.
int main(void)
{
int *p;
p=malloc(sizeof(int));
printf("size of p=%d\n",sizeof(p));
p[500]=999999;
printf("p[0]=%d",p[500]);
p[500000]=42;
printf("p[0]=%d",p[500000]);
return 0;
}
If you don't allocate memory, p has garbage in it, so writing through it will likely fail. Once you've made a valid malloc call, p points to valid memory and you can write to it. You are overwriting memory you shouldn't write to, but nobody is going to hold your hand and tell you about it. If you run your program under a memory debugger such as valgrind, it will tell you.
Welcome to C.
Writing past the end of your memory is Undefined Behaviour™, which means that anything could happen, including your program operating as if what you just did was perfectly legal. The reason your program runs as if you had done malloc(501*sizeof(int)) is completely implementation-specific and can depend on anything, including the phase of the moon.
This is because p will be assigned some address no matter what size you pass to malloc(). With a zero size you are referencing invalid memory, since nothing was actually allocated, but the address may land somewhere that doesn't crash the program; the behavior is undefined regardless.
If you do not call malloc() at all, p points to a garbage location, and trying to access that is likely to crash the program.
I tried it with malloc(0*sizeof(int))
According to C99, if the size passed to malloc is 0, the C runtime can return either a NULL pointer or behave as if the request were for a non-zero allocation, except that the returned pointer must not be dereferenced. So it is implementation-defined (e.g. some implementations return a zero-length buffer); in your case you did not get a NULL pointer back, but you are using a pointer you should not be using. In a different runtime you could get a NULL pointer back.
When you call malloc() a small chunk of memory is carved out of a larger page for you.
malloc(sizeof(int));
Does not actually allocate 4 bytes on a 32-bit machine: the allocator pads the request up to a minimum size, plus the heap metadata used to track the chunk through its lifetime (chunks are placed in bins based on their size and marked in-use or free by the allocator). See http://en.wikipedia.org/wiki/Malloc, or more specifically http://en.wikipedia.org/wiki/Malloc#dlmalloc_and_its_derivatives if you're testing this on Linux.
So writing beyond the bounds of your chunk doesn't necessarily mean you are going to crash. At p + 500 you are not writing outside the page allocated for that initial chunk, so you are technically writing to a valid mapped address. Welcome to memory corruption.
http://www.google.com/search?q=heap+overflows
Our CheckPointer tool can detect this error. It knows that p was allocated a 4-byte chunk, so when the assignment is made it can tell that the target lies outside the area allocated for p. It will tell you that the p[500] assignment is wrong.