C strange int array pointer - c

int *array; // this declares a pointer to an int, right?
array = malloc(sizeof(int)); // this allocates space for ONE int, right?
scanf("%d", &array[4]); // this must generate a segmentation fault because there isn't enough allocated space, right?
printf("%d", array[4]); // it shouldn't print anything...
but it prints 4! Why?

Reading or writing off the end of an array in C results in undefined behavior, meaning that absolutely anything can happen. It can crash the program, or format the hard drive, or set the computer on fire. In your case, it just so happens to be the case that this works out perfectly fine, probably because malloc pads the length of the allocated memory for efficiency reasons (or perhaps because you're trashing memory that malloc wants to use later on). However, this isn't at all portable, isn't guaranteed to work, and is a disaster waiting to happen.
Hope this helps!

Because the operating system happens to have given your process the memory you touched. This code is by no means guaranteed to run on another machine, or even the next time you run it.

C doesn't check whether you step outside the boundaries of an array; that is left to the programmer. This is undefined behavior (although it's not quite the dramatic kind people usually mean by that term, because most of the time the memory just past the array is part of your process's stack or heap, so you can still reach it).
When you write array[4] you are actually saying *(array + 4), which the compiler turns into an access 4*sizeof(int) bytes past the start of the array. That address exists: it could be read-only, it could belong to another array or variable in your program, or the access might just work. There is no guarantee of an error, and that is exactly what undefined behavior means.
To understand more about undefined behavior you can go to this article (which I find very interesting).

Related

Is my malloc function allocating more than I intend to

Here is the code I'm trying to run. The malloc function allocates 800 bytes.
void DynamicMemoryAllocationUsingMalloc()
{
    int *p, i;

    if ((p = (int*) malloc(800)) == NULL)
    {
        printf("\n Out of Memory \n");
        exit(0);
    }
    for (i = 0; i < 800; i++)
    {
        printf(" 0x%x", (p + i));
        printf(" %d\n", *(p + i));
    }
}
But inside the for loop, when I print the addresses, I'm able to hop through 800 memory locations (using the integer pointer p), each 4 bytes long (the size of an integer), "safely", which amounts to 3200 bytes. How is that possible, or am I just being lucky in not getting an access-violation error even though I'm actually entering a memory area that I've not allocated for my program? I see garbage in all the memory locations, for the obvious reason that I've not set those locations to anything.
Note: It is a C program being run on Windows 7.
How is that possible, or am I just being lucky in not getting an access-violation error even though I'm actually entering a memory area that I've not allocated for my program?
When the code reaches printf(" %d\n", *(p + 200)); it is attempting to read outside the allocated memory. That is undefined behavior (UB).
UB is UB. It may happen like this every day, or change the next time you run.
You are not lucky. Lucky would be for your code to stop right there.
Even reading uninitialized int data is UB. So the code has UB (or maybe implementation-defined behavior) as soon as printf(" %d\n", *(p + 0));. In any case, the code could have stopped right there.
Is my malloc function allocating more than I intend to?
That is the tricky bit. Code that invokes UB creates questionable results, and code without UB has no standard way to answer the question. The only non-UB way to determine this is if the library supplies a function with the answer (the True_size() here is hypothetical):
printf("True size %lu\n", (unsigned long) True_size(p));
Note: OP asserts int is 4 bytes.
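A hedged sketch of the loop with the bounds corrected: 800 bytes holds 800 / sizeof(int) ints (200 when int is 4 bytes), and each element is written before it is read, avoiding both kinds of UB discussed above:

```c
#include <stdio.h>
#include <stdlib.h>

void DynamicMemoryAllocationUsingMalloc(void)
{
    int *p;
    size_t i, count = 800 / sizeof *p;   /* elements that fit in 800 bytes */

    if ((p = malloc(800)) == NULL)
    {
        printf("\n Out of Memory \n");
        exit(0);
    }
    for (i = 0; i < count; i++)
    {
        p[i] = (int)i;                   /* initialize before reading: no UB */
        printf(" %p %d\n", (void *)(p + i), p[i]);
    }
    free(p);
}
```

Note the %p format with a cast to void *, which is the portable way to print a pointer.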
It's working probably because you're not reaching past the range of memory currently allocated for the process. Modern systems typically allocate memory to a process in 4-kilobyte pages. Your first allocation is possibly at the beginning of a page, and the memory you're snooping in is probably in the unallocated remainder of that first page.
The OS can't detect invalid memory accesses unless they go beyond the ranges of memory allocated for your process. As far as the OS is concerned, it gave your process that page and the process is using it. It doesn't care whether or not the malloc routine used by your process has said your program "owns" that memory yet.
It might be a fun experiment to see how far you can read before you get an access violation. Just loop and print each address out before you try to read it.
But inside the for loop, when I print the addresses, I'm able to hop through 800 memory locations (using the integer pointer p), each 4 bytes long (the size of an integer), "safely", which amounts to 3200 bytes.
Where I guess by "safely" you mean that the program does not crash. This is an ok definition when applied to air travel, but not so appropriate for a computer program.
How is that possible, or am I just being lucky in not getting an access-violation error even though I'm actually entering a memory area that I've not allocated for my program?
By accessing unallocated memory your program exhibits undefined behavior. You are to be commended for recognizing the problem. To quote @KerrekSB, however, "Undefined behavior is undefined". Generally speaking, you cannot assume any particular manifestation of undefined behavior.
If your program did happen to crash, with an access violation, for example, then you could be sure that it had exhibited undefined behavior, simply because C does not define any way to produce that behavior. But just because it seems to do what you expect does not mean that its behavior is defined. If it is not defined then, generally speaking, you cannot be confident that it will be consistent, either.
So basically, yes, you're just lucky. Or maybe unlucky. Myself, I'd rather have the program crash, so that I am alerted to the problem.
This is an example of undefined behavior.
Logically, this program should break, but it does not, because the process image has some extra space that you can overflow into without the operating system sending a segfault. Instead of going to 800, go up to 1000, 10000, and so on; eventually you'll get a segfault at some arbitrary number of iterations.
The reason you can go so high is that your program has a lot of overhead in RAM, and this overhead can be overflowed into.
C standard answer: Accessing memory beyond what you have allocated results in undefined behaviour.
Real world answer: Your program allocates memory using malloc(), which provisions it from the operating system, in this case Windows. However, each malloc() call doesn't result in a call to the operating system. malloc() will in fact usually allocate a bit more memory than it needs right now and will then break off a chunk which is at least the size you requested. This is done for performance reasons, since every call to the operating system has a bit of overhead. Also, there is a minimum "page size", which is the smallest unit of memory that can be allocated from the operating system. 4096 bytes is a typical page size.
So in your case, you are accessing the memory which malloc has provisioned from the system but has not allocated for use. You should avoid this since the next call to malloc might cause the memory to be allocated for another purpose.

C malloc function size

So I have this little tricky question I need to answer:
On which segment in memory is c+9 pointing to if the function is:
void f()
{
    int *c = (int*) malloc(10);
}
I think I know how malloc works, and I looked up other questions, so this should allocate 10 bytes of memory and return the address of the first one, cast to int *.
So because sizeof(int) is 4 bytes, I thought there wouldn't be enough space for 9 integers, and c+9 would point to some memory outside the range of the allocated memory and it would return an error. But no, the program works just fine, as if 10*sizeof(int) bytes had been allocated. So where am I making a mistake?
Your mistake is believing that it works just fine.
Yes, it probably runs, yes, it probably doesn't segfault, but no, it is not correct.
So because sizeof(int) is 4 bytes, I thought that there wouldn't be enough space for 9 integers, and c+9 would point to some memory outside the range of the allocated memory
This is correct
and it would return an error
But this is unfortunately not correct in every case. The OS can only hand out space in full pages, which means you get memory in multiples of 4096 bytes (one page). So even though malloc (which is implemented in userspace) gives you 10 bytes, your program has at least 4096 bytes from the OS. BUT: malloc will eventually hand out the rest of that page to other allocations, and then this code will probably introduce a bug.
TLDR: This is UB, even though it looks like it works, never do this.
You're making the mistake in assuming that undefined behavior implies that something "bad" is guaranteed to happen. Accessing c[9] in your case is undefined behavior as you haven't malloced enough memory - and is something that you should not do.
Undefined behavior means that the standard allow for any behavior. For this particular error you would often get an non-localized misbehavior, accessing c[9] would work apparently fine and no odd things happens when you do it, but then in an unrelated piece of code accessing an unrelated piece of data results in error. Often these kind of mistakes would also corrupt the data used by the memory allocation system which may make malloc and free to misbehave.
C programs will not return an error if you poke outside of the assigned memory range. The result is not defined, it may hang or (apparently) work fine. But it is not fine.
You are right in that malloc gives you 10 characters (usually 8-bit bytes). Allocating an area for ints that is not a multiple of the int size is in itself fishy... but not illegal. The resulting address is interpreted as a pointer to an int (typically 32 bits), and you are asking for the address 9 ints beyond the start of the allocated area. That in itself is fine, but trying to access it is undefined behaviour: anything might happen, including whatever you naïvely expect, nothing whatsoever, a crash, or the end of the universe. What will usually happen is that you get assigned memory from a bigger area containing other objects and free space (and the extra data malloc uses to keep track of the whole mess). Reading there causes no harm; writing could damage other data or mess up malloc's data structures, leading to mysterious behaviour or crashes later on. If you are lucky, the new space is allocated at a boundary, and the out-of-limits access gives a segmentation fault, pointing at the culprit.
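If the intent was room for ten ints, the fix is to scale the request by the element size. A minimal sketch:

```c
#include <stdlib.h>

void f(void)
{
    /* Request 10 * sizeof *c bytes: now c + 9 points at a real, owned element. */
    int *c = malloc(10 * sizeof *c);
    if (c == NULL)
        return;

    c[9] = 1;    /* last valid element: within bounds */
    free(c);
}
```

With this allocation, c+9 points at the last element of the array, well inside the allocated block.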

C Array Segmentation Faults only after a certain threshold

I'm looking at this simple program. My understanding is that trying to modify the values at memory addresses past the maximum index should result in a segmentation fault. However, the following code runs without any problems; I am even able to print out all 6 of the values at indexes 0 through 5.
int main(void)
{
    int i;
    int a[3];

    for (i = 0; i < 6; i++)
        a[i] = i;
}
However, when I change the for loop to
for(i=0; i<7; i++)
and executing the program, it will segfault.
This almost seems to me like it is some kind of extra padding done by malloc. Why does this happen only after the 6th index (a+6)? Will this behavior happen with longer/shorter arrays? Excuse my foolishness as a lowly Java programmer :)
This almost seems to me like it is some kind of extra padding done by malloc.
You did not call malloc(). You declared a as an array of 3 integers in stack memory, whereas malloc() uses heap memory. In both cases, accessing any element past the last one (the third one, a[2], in this case) is undefined behaviour; it may cause a segmentation fault, but there is no guarantee of one.
Well, malloc didn't do it, because you didn't call malloc. My guess is, the extra three writes were enough to chew through your frame pointer and your stack pointer, but not through your return address (but the seventh one hit that). Your program is not guaranteed to crash when you access out-of-bounds memory (though there are other languages which do guarantee it), any more than it is guaranteed not to crash. That's what undefined behavior is: unpredictable.
As others said, accessing an array beyond its limits is undefined behaviour. So what happens? If you access memory that you should not access, it depends on where that memory is.
Generally (but very generally, specifics can vary between systems and compilers), the following main things can happen:
It can be that you simply access other variables of your process that lie directly "behind" the array. If you write to that memory, you simply modify the values of the other variables. You will probably not get a segfault, so you may never notice why your program produces bad results or acts so weird, or why, during debugging, your variables have values you never (knowingly) assigned to them. This is, IMO, really bad, because you think everything is fine while it isn't.
It can be, especially on a stack, that you access other data on the stack, like saved processor registers or even the address to which the processor should return if the function ends. If you overwrite these with some other data, it is hard to tell what happens. In that case, a segfault is probably the lesser of all possible evils. You simply don't know what can happen.
If the memory beyond your array does not belong to your process then, on most modern computers, you will get a segfault or similar exception (e.g. an access violation, or whatever your OS calls it).
I may have forgotten a few more possible problems that can occur, but those are, IMO, the most usual things that happen if you write beyond array bounds.
It depends on what lies in the adjacent memory: if the memory just past the array is not accessible to your process, you get a segmentation fault; otherwise the write lands in that extra memory and no fault is raised. There is no need for malloc here, because the array declaration itself reserves the memory.
On your system, accessible memory happens to cover 6 integers, and when you try to access the next location (which is not accessible) you get a segmentation fault.

C program help: Insufficient memory allocation but still works...why? [duplicate]

Possible Duplicate:
behaviour of malloc(0)
I'm trying to understand memory allocation in C. So I am experimenting with malloc. I allotted 0 bytes for this pointer but yet it can still hold an integer. As a matter of fact, no matter what number I put into the parameter of malloc, it can still hold any number I give it. Why is this?
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *ptr = (int*) malloc(0);
    *ptr = 9;
    printf("%i", *ptr); // 9
    free(ptr);
    return 0;
}
It still prints 9, what's up with that?
If size is 0, then malloc() returns either NULL, or a unique pointer
value that can later be successfully passed to free().
I guess you are hitting the 2nd case.
Anyway, that pointer just happens, by luck, to be in an area where you can write without generating a segmentation fault, but you are probably writing into the space of some other variable, messing up its value.
A lot of good answers here. But it is definitely undefined behavior. Some people declare that undefined behavior means that purple dragons may fly out of your computer or something like that... there's probably some history behind that outrageous claim that I'm missing, but I promise you that purple dragons won't appear regardless of what the undefined behavior will be.
First of all, let me mention that in the absence of an MMU, on a system without virtual memory, your program would have direct access to all of the memory on the system, regardless of its address. On a system like that, malloc() is merely the guy who helps you carve out pieces of memory in an ordered manner; the system can't actually enforce you to use only the addresses that malloc() gave you. On a system with virtual memory, the situation is slightly different... well, ok, a lot different. But within your program, any code in your program can access any part of the virtual address space that's mapped via the MMU to real physical memory. It doesn't matter whether you got an address from malloc() or whether you called rand() and happened to get an address that falls in a mapped region of your program; if it's mapped and not marked execute-only, you can read it. And if it isn't marked read-only, you can write it as well. Yes. Even if you didn't get it from malloc().
Let's consider the possibilities for the malloc(0) undefined behavior:
malloc(0) returns NULL.
OK, this is simple enough. There really is a physical address 0x00000000 in most computers, and even a virtual address 0x00000000 in all processes, but the OS intentionally doesn't map any memory to that address so that it can trap null pointer accesses. There's a whole page (generally 4KB) there that's just never mapped at all, and maybe even much more than 4KB. Therefore if you try to read or write through a null pointer, even with an offset from it, you'll hit these pages of virtual memory that aren't even mapped, and the MMU will throw an exception (a hardware exception, or interrupt) that the OS catches, and it declares a SIGSEGV (on Linux/Unix), or an illegal access (on Windows).
malloc(0) returns a valid address to previously unallocated memory of the smallest allocable unit.
With this, you actually get a real piece of memory that you can legally call your own, of some size you don't know. You really shouldn't write anything there (and probably not read either) because you don't know how big it is, and for that matter, you don't know if this is the particular case you're experiencing (see the following cases). If this is the case, the block of memory you were given is almost guaranteed to be at least 4 bytes and probably is 8 bytes or perhaps even larger; it all depends on whatever the size is of your implementation's minimum allocable unit.
malloc(0) intentionally returns the address of an unmapped page of
memory other than NULL.
This is probably a good option for an implementation, as it would allow you or the system to track & pair together malloc() calls with their corresponding free() calls, but in essence, it's the same as returning NULL. If you try to access (read/write) via this pointer, you'll crash (SEGV or illegal access).
malloc(0) returns an address in some other mapped page of memory
that may be used by "someone else".
I find it highly unlikely that a commercially-available system would take this route, as it serves to simply hide bugs rather than bring them out as soon as possible. But if it did, malloc() would be returning a pointer to somewhere in memory that you do not own. If this is the case, sure, you can write to it all you want, but you'd be corrupting some other code's memory, though it would be memory in your program's process, so you can be assured that you're at least not going to be stomping on another program's memory. (I hear someone getting ready to say, "But it's UB, so technically it could be stomping on some other program's memory." Yes, in some environments, like an embedded system, that is right. But no modern commercial OS would let one process have access to another process's memory as easily as simply calling malloc(0); in fact, you simply can't get at another process's memory without going through the OS to do it for you.) Anyway, back to reality... This is the one where "undefined behavior" really kicks in: if you're writing to "someone else's memory" (in your own program's process), you'll be changing the behavior of your program in difficult-to-predict ways. Knowing the structure of your program and where everything is laid out in memory, it's fully predictable. But from one system to another, things would be laid out at different locations in memory, so the effect on one system would not necessarily be the same as the effect on another system, or on the same system at a different time.
And finally.... No, that's it. There really, truly, are only those four
possibilities. You could argue for special-case subset points for
the last two of the above, but the end result will be the same.
For one thing, your compiler may be seeing these two lines back to back and optimizing them:
*ptr = 9;
printf("%i", *ptr);
With such a simplistic program, your compiler may actually be optimizing away the entire memory allocate/free cycle and using a constant instead. A compiler-optimized version of your program could end up looking more like simply:
printf("9");
The only way to tell if this is indeed what is happening is to examine the assembly that your compiler emits. If you're trying to learn how C works, I recommend explicitly disabling all compiler optimizations when you build your code.
Regarding your particular malloc usage, remember that you will get a NULL pointer back if allocation fails. Always check the return value of malloc before you use it for anything. Blindly dereferencing it is a good way to crash your program.
The link that Nick posted gives a good explanation about why malloc(0) may appear to work (note the significant difference between "works" and "appears to work"). To summarize the information there, malloc(0) is allowed to return either NULL or a pointer. If it returns a pointer, you are expressly forbidden from using it for anything other than passing it to free(). If you do try to use such a pointer, you are invoking undefined behavior and there's no way to tell what will happen as a result. It may appear to work for you, but in doing so you may be overwriting memory that belongs to another program and corrupting their memory space. In short: nothing good can happen, so leave that pointer alone and don't waste your time with malloc(0).
You can find the answer to why the malloc(0)/free() calls don't crash here:
zero size malloc
About *ptr = 9: it is just like overflowing a buffer (like malloc'ing 10 bytes and accessing the 11th); you are writing to memory you don't own, and doing that is looking for trouble. In this particular implementation, malloc(0) happens to return a pointer instead of NULL.
Bottom line, it is wrong even if it seems to work on a simple case.
Some memory allocators have the notion of a "minimum allocatable size". So even if you pass zero, they may return a pointer to a word-sized piece of memory, for example. You need to check your system allocator's documentation. But even if it does return a pointer to some memory, it would be wrong to rely on that: the pointer is only supposed to be passed to realloc() or free().

Writing to pointer out of bounds after malloc() not causing error

When I try the code below, it works fine. Am I missing something?
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *p;
    p = malloc(sizeof(int));
    printf("size of p=%zu\n", sizeof(p));
    p[500] = 999999;
    printf("p[500]=%d", p[500]);
    return 0;
}
I tried it with malloc(0*sizeof(int)) and other sizes, but it works just fine. The program only crashes when I don't use malloc at all. So even if I allocate 0 bytes of memory for the array p, it still stores values properly. So why am I even bothering with malloc then?
It might appear to work fine, but it isn't very safe at all. By writing data outside the allocated block of memory you are overwriting some data you shouldn't. This is one of the greatest causes of segfaults and other memory errors, and what you're observing with it appearing to work in this short program is what makes it so difficult to hunt down the root cause.
Read this article, in particular the part on memory corruption, to begin understanding the problem.
Valgrind is an excellent tool for analysing memory errors such as the one you provide.
@David made a good comment. Compare the results of running your code to running the following code. Note the latter results in a runtime error (with pretty much no useful output!) on ideone.com (click on links), whereas the former succeeds as you experienced.
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *p;
    p = malloc(sizeof(int));
    printf("size of p=%zu\n", sizeof(p));
    p[500] = 999999;
    printf("p[500]=%d", p[500]);
    p[500000] = 42;
    printf("p[500000]=%d", p[500000]);
    return 0;
}
If you don't allocate memory, p has garbage in it, so writing through it will likely fail. Once you've made a valid malloc call, p points to a valid memory location and you can write to it. You are overwriting memory that you shouldn't write to, but nobody's going to hold your hand and tell you about it. If you run your program under a memory debugger such as valgrind, it will tell you.
Welcome to C.
Writing past the end of your memory is Undefined Behaviour™, which means that anything could happen- including your program operating as if what you just did was perfectly legal. The reason for your program running as if you had done malloc(501*sizeof(int)) are completely implementation-specific, and can indeed be specific to anything, including the phase of the moon.
This is because p would be assigned some address no matter what size you pass to malloc(). With a zero size you would be referencing invalid memory, since none has been allocated, but the address may fall in a location that doesn't cause a program crash, though the behavior is undefined.
If you do not use malloc() at all, p points to a garbage location, and trying to access that is likely to crash the program.
I tried it with malloc(0*sizeof(int))
According to C99, if the size passed to malloc is 0, a C runtime can return either a NULL pointer, or a pointer as if the request had been for a non-zero size, except that the returned pointer must not be dereferenced. So it is implementation-defined (e.g. some implementations return a zero-length buffer), and in your case you did not get a NULL pointer back, but you are using a pointer you should not be using. If you try it in a different runtime, you could get a NULL pointer back.
When you call malloc() a small chunk of memory is carved out of a larger page for you.
malloc(sizeof(int));
does not actually allocate just 4 bytes on a 32-bit machine (the allocator pads it up to a minimum size), plus the size of the heap metadata used to track the chunk through its lifetime (chunks are placed in bins based on their size and marked in-use or free by the allocator). See http://en.wikipedia.org/wiki/Malloc or, more specifically, http://en.wikipedia.org/wiki/Malloc#dlmalloc_and_its_derivatives if you're testing this on Linux.
So writing beyond the bounds of your chunk doesn't necessarily mean you are going to crash. At p+500 you are not writing outside the bounds of the page allocated for that initial chunk, so you are technically writing to a valid mapped address. Welcome to memory corruption.
http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=heap+overflows
Our CheckPointer tool can detect this error. It knows that p was allocated a chunk of 4 bytes, and thus that when the assignment is made, it lands outside the area for which p was allocated. It will tell you that the p[500] assignment is wrong.
