The problem seems to be the "*p = 20;" command, although I simply do not get why. Whenever I add it, I get the error "stack around the variable 'var' was corrupted".
main(void)
{
int* p;
int var;
p = &var;
*p = 16;
p++;
*p = 20;
system("pause");
}
After this statement
p++;
the pointer p does not point to a valid object (it points now to the memory beyond the object var of the type int). Thus this statement
*p = 20;
results in undefined behavior. That is
stack around the variable 'var' was corrupted
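For contrast, here is a minimal sketch of the same sequence of statements made well-defined by giving p a second int to move to. The function name write_two is hypothetical and not part of the question.

```c
/* Hypothetical helper: performs the question's two stores, but through
 * a pointer that stays inside a two-element array. */
int write_two(int pair[2])
{
    int *p = pair;  /* p points at pair[0] */
    *p = 16;
    p++;            /* p now points at pair[1], still a valid object */
    *p = 20;        /* defined: the write lands inside the array */
    return pair[0] + pair[1];
}
```

With a single int var there is no second element, so the same p++ leaves p pointing one past the object, where it may not be dereferenced.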
Vlad from Moscow's answer is correct. Here is some additional background on why you see this particular message.
You get this error because you are overwriting protected memory on the stack, as Vlad from Moscow has shown. Many compilers can protect you against stack corruption. There are various ways to implement this, but one way is to use a "canary," which is just a value or values at a specific memory location. If those values are changed, a check inserted by the compiler detects that the stack was corrupted and can report an error such as "stack around the variable 'var' was corrupted." In your case, the memory immediately after var (4 bytes past its start, assuming a 4-byte int) is probably a canary value on the stack (if your particular compiler uses canaries), which you are changing, and that causes the error.
See the Wikipedia article on buffer overflow protection for more information.
Here's an excerpt from the article:
Typically, buffer overflow protection modifies the organization of stack-allocated data so it includes a canary value that, when destroyed by a stack buffer overflow, shows that a buffer preceding it in memory has been overflowed. By verifying the canary value, execution of the affected program can be terminated, preventing it from misbehaving or from allowing an attacker to take control over it. Other buffer overflow protection techniques include bounds checking, which checks accesses to each allocated block of memory so they cannot go beyond the actually allocated space, and tagging, which ensures that memory allocated for storing data cannot contain executable code.
Edit:
I actually really like the next paragraph in the wiki article as well:
Overfilling a buffer allocated on the stack is more likely to influence program execution than overfilling a buffer on the heap because the stack contains the return addresses for all active function calls. However, similar implementation-specific protections also exist against heap-based overflows.
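The canary idea can be illustrated by hand. The following is only a sketch under stated assumptions: the struct, the marker value and all function names are hypothetical, and a real compiler places and checks its canaries itself. The simulated write goes through the struct's own bytes, so the example stays within defined behavior.

```c
#include <stddef.h>
#include <string.h>

#define GUARD_VALUE 0xBEEFu   /* arbitrary marker chosen for this sketch */

struct guarded_buf {
    char data[8];     /* the buffer a program is supposed to stay within */
    unsigned guard;   /* stands in for the canary placed after it */
};

void guarded_init(struct guarded_buf *g)
{
    memset(g->data, 0, sizeof g->data);
    g->guard = GUARD_VALUE;
}

/* Simulate a write of n bytes starting at the buffer. The write is
 * capped at sizeof *g, so it never leaves the struct. */
void simulate_write(struct guarded_buf *g, size_t n)
{
    if (n > sizeof *g)
        n = sizeof *g;
    memset(g, 'A', n);
}

/* Returns 1 while the guard still holds its marker, 0 once it was clobbered. */
int guard_intact(const struct guarded_buf *g)
{
    return g->guard == GUARD_VALUE;
}
```

A write of sizeof g.data bytes leaves the guard intact, while a write of sizeof g bytes clobbers it, which is the condition a canary check reports.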
Related
I came across part of a question where the code below produces output, but I need an explanation of why it works.
char arr[4];
strcpy(arr,"This is a link");
printf("%s",arr);
When I compile and execute, I get the following output.
Output:
This is a link
The short answer to why it worked (that time) is: you got lucky. Writing beyond the end of an array is undefined behavior. Since undefined behavior is just that, undefined, it could just as easily have caused a segmentation fault as produced output (though generally, stack corruption is the result).
When handling character arrays in C, you are responsible for ensuring you have allocated sufficient storage. When you intend to use the array as a character string, you must allocate storage for each character plus 1 for the nul-terminating character at the end (which is the very definition of a nul-terminated string in C).
Why did it work? Generally, when you request, say, char arr[4]; the compiler only guarantees that you have 4 bytes allocated for arr. However, depending on the compiler, the alignment, etc., the compiler may actually allocate whatever it uses as a minimum allocation unit to arr. This means that while you have only requested 4 bytes and are only guaranteed 4 usable bytes, the compiler may actually have set aside 8, 16, 32, 64, or 128 bytes, etc.
Or, again, you were just lucky that arr was the last allocation requested and nothing yet has requested or written to the memory address starting at byte-5 following arr in memory.
The point being, you requested 4-bytes and are only guaranteed to have 4-bytes available. Yes it may work in that one printf before anything else takes place in your code, but your code is wholly unreliable and you are playing Russian-Roulette with stack corruption (if it has not already taken place).
In C, the responsibility falls to you to ensure that your code, storage and memory use are all well-defined and that you do not wander off into the realm of the undefined, because if you do, all bets are off, and your code isn't worth the bytes it is stored in.
How could you make your code well-defined? Appropriately limit and validate each required step in your code. For your snippet, you could use strncpy instead of strcpy and then affirmatively nul-terminate arr before calling printf, e.g.
char arr[4] = ""; /* initialize all values */
strncpy(arr,"This is a link", sizeof arr); /* limit copy to bytes available */
arr[sizeof arr - 1] = 0; /* affirmatively nul-terminate */
printf ("%s\n",arr);
Now, you can rely on the contents of arr throughout the remainder of your code.
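If you also want to know whether the string was cut short, snprintf is one alternative, since it returns the length the full string would have needed. A minimal sketch follows; the helper name copy_checked is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy src into dst, whose capacity dstsize includes
 * room for the terminator. Returns 1 if the whole string fit, 0 otherwise.
 * On truncation dst holds a nul-terminated prefix of src. */
int copy_checked(char *dst, size_t dstsize, const char *src)
{
    int needed;

    if (dstsize == 0)
        return 0;
    needed = snprintf(dst, dstsize, "%s", src);
    return needed >= 0 && (size_t)needed < dstsize;
}
```

Called with char arr[4] and "This is a link", it stores "Thi" and returns 0, so the caller can react to the truncation instead of silently printing a shortened string.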
Your code has a memory issue (a buffer overrun). The function strcpy copies bytes until it reaches the null character. The function printf prints until it reaches the null character.
There is no guarantee on the behavior of this piece of code.
It's just like this: you told me "I'll pick you up at 5:00 p.m." and when you came I would be there (that is guaranteed). But I can't guarantee that I would have grabbed you a cup of coffee, because you didn't tell me you wanted one. Maybe I'm very nice and bought two cups of coffee, or maybe I'm a cheapskate and bought just one for myself.
It may work. It may not. It may fail immediately and obviously. It may fail at some arbitrary future time and in subtle ways that will drive you insane.
That is the often-insidious nature of undefined behaviour. Don't do it.
If it works at all, it's totally by accident and in no way guaranteed. It's possible that you're overwriting stuff on the stack or in other memory (depending on the implementation and how/where the actual variable str is defined(a)) but that the memory being overwritten is not used after that point (given the simple nature of the code).
That possibility of it working accidentally in no way makes it a good idea.
For the language lawyers among us, section J.2 (instances of undefined behaviour) of C11 clearly states:
An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]).
That informative section references 6.5.6, which is normative, and which states when discussing pointer/integer addition (of which a[b] is an example):
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
(a) For example, on my system, declaring the variable inside main causes the program to crash because the buffer overflow trashes the return address on the stack.
However, if I put the declaration at file level (outside of main), it seems to run just fine, printing the message then exiting the program.
But I assure you that's only because the memory you've trashed is not important for the continuation of the program in this case. It will almost certainly be important in anything more substantial than this example.
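The wording of 6.5.6 quoted above allows a pointer one past the last element to be formed and compared, just not dereferenced. A minimal sketch of the usual idiom that relies on this (the function name sum_ints is hypothetical):

```c
#include <stddef.h>

/* Sum n ints starting at a, using a one-past-the-end pointer as the stop mark. */
long sum_ints(const int *a, size_t n)
{
    const int *end = a + n;   /* one past the last element: may be formed and compared */
    long total = 0;
    const int *p;

    for (p = a; p != end; ++p)
        total += *p;          /* p is always strictly before end here */
    return total;
}
```

The loop never evaluates *end, which is exactly the use the standard forbids.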
Your code may appear to work as long as the printf is placed just after the strcpy, but it is still wrong.
Try the following, and it will most likely not print the expected string:
int j;
char arr[4];
int i;
strcpy(arr,"This is a link");
i=0;
j=0;
printf("%s",arr);
To understand why, you need to understand the idea of the stack. Local variables are typically allocated on the stack. In your code, the program has set aside 4 bytes for arr, and when you copy a string larger than 4 bytes into it, you overwrite (corrupt) some other memory. Because you access arr immediately after the strcpy, the area you have overwritten, which may belong to other variables, has not yet been updated by the program, and that is why your printf appears to work. In the example above, other variables that may fall into the overwritten region are updated before the printf, so you are unlikely to get the desired output.
Your code also appears to work because the stack grows downwards; had it grown the other way, you would probably not have got the desired output either.
#include <stdio.h>
#include <stdlib.h>
int main()
{
int *a;
a = (int *)malloc(100*sizeof(int));
int i=0;
for (i=0;i<100;i++)
{
a[i] = i+1;
printf("a[%d] = %d \n " , i,a[i]);
}
a = (int*)realloc(a,75*sizeof(int));
for (i=0;i<100;i++)
{
printf("a[%d] = %d \n " , i,a[i]);
}
free(a);
return 0;
}
In this program I expected a segmentation fault, because I'm trying to access elements of an array that were freed by realloc(). But the output is pretty much the same, except for a few final elements!
So my doubt is whether the memory is actually getting freed ? What exactly is happening ?
The way realloc works is that it guarantees that a[0]..a[74] will have the same values after the realloc as they did before it.
However, the moment you try to access a[75] after the realloc, you have undefined behaviour. This means that the program is free to behave in any way it pleases, including segfaulting, printing out the original values, printing out some random values, not printing anything at all, launching a nuclear strike, etc. There is no requirement for it to segfault.
So my doubt is whether the memory is actually getting freed?
There is absolutely no reason to think that realloc is not doing its job here.
What exactly is happening?
Most likely, the memory is getting freed by shrinking the original memory block and not wiping out the now unused final 25 array elements. As a result, the undefined behaviour manifests itself by printing out the original values. It is worth noting that even the slightest changes to the code, the compiler, the runtime library, the OS, etc. could make the undefined behaviour manifest itself differently.
You may get a segmentation fault, but you may not. The behaviour is undefined, which means anything can happen, but I'll attempt to explain what you might be experiencing.
There's a mapping between your virtual address space and physical pages, and that mapping is usually in pages of 4096 bytes at least (well, there's virtual memory also, but let's ignore that for the moment).
You get a segmentation fault if you attempt to address virtual address space that doesn't map to a physical page. Your call to realloc may not have resulted in a physical page being returned to the system, so it is still mapped to your program and can be used. However, a following call to malloc could use that space, or it could be reclaimed by the system at any time. In the former case you would possibly overwrite another variable; in the latter case you will segfault.
Accessing an array beyond its bounds is undefined behaviour. You might encounter a runtime error. Or you might not. The memory manager may well have decided to re-use the original block of memory when you re-sized. But there's no guarantee of that. Undefined behaviour means that you cannot reason about or predict what will happen. There's no grounds for you to expect anything to happen.
Simply put, don't access beyond the end of the array.
Some other points:
The correct main declaration here is int main(void).
Casting the value returned by malloc is not needed and can mask errors. Don't do it.
Always store the return value of realloc into a separate variable so that you can detect NULL being returned and so avoid losing and leaking the original block.
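The last point can be sketched as follows. This is a minimal illustration, not the only way to do it; the helper name resize_ints is hypothetical, and refusing a zero count is a choice made here to avoid the special cases of realloc with size 0.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resize *arr to hold new_count ints.
 * Returns 1 on success, 0 on failure. */
int resize_ints(int **arr, size_t new_count)
{
    int *tmp;

    if (new_count == 0 || new_count > SIZE_MAX / sizeof **arr)
        return 0;                 /* refuse zero-size and overflowing requests */
    tmp = realloc(*arr, new_count * sizeof **arr);
    if (tmp == NULL)
        return 0;                 /* *arr is still valid and still owned by the caller */
    *arr = tmp;
    return 1;
}
```

After a successful resize_ints(&a, 75), only a[0] through a[74] may be accessed, so the second loop in the question would have to stop at 75 rather than 100.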
The program was written in C and compiled with GCC.
I was trying to help a friend who was trying to (shallow) copy a value that was passed into a function. The value was a struct that held primitives and pointers (no arrays or buffers). Unsure of how malloc works, he used it in a way similar to the following:
void some_function(int rand_params, SOME_STRUCT_TYPEDEF *ptr){
SOME_STRUCT_TYPEDEF *cpy;
cpy = malloc(sizeof(SOME_STRUCT_TYPEDEF));// this line makes a difference?!?!?
cpy = ptr;// overwrites cpy anyway, right?
//prints a value in the struct documented to be a char*,
//sorry couldn't find the documentation right now
}
I told him that the malloc shouldn't affect the program, so I told him to comment it out. To my surprise, the version with the malloc produced different output (including some of the intended strings) from the version with the malloc commented out (which prints out garbage values). The pointer that's passed into this function comes from some other library function which I don't have documentation for at the moment. The best I can assume is that the pointer was to a value that was actually a buffer (on the stack). But I still don't see how the malloc can cause such a difference. Could someone explain how that malloc may cause a difference?
I would say that the evident lack of understanding of pointers is responsible for ptr actually pointing to memory that has not been correctly allocated (if at all), and you are experiencing undefined behaviour. The issue is elsewhere in the program, prior to the call to some_function.
As an aside, the correct way to allocate and copy the data is this:
SOME_STRUCT_TYPEDEF *cpy = malloc(sizeof(SOME_STRUCT_TYPEDEF));
if (cpy) {
*cpy = *ptr;
// Don't forget to clean up later
free(cpy);
}
However, unless the structure is giant, it's a bit silly to do it on the heap when you can do it on the stack like this:
SOME_STRUCT_TYPEDEF cpy = *ptr;
I can't see why there is a difference in the printed output.
Can you show the printing code?
In any case, the malloc causes a memory leak. You should not allocate memory for cpy here, because pointer assignment is not a shallow copy: it simply makes cpy point to the same memory that ptr points to, by storing that memory's starting address in cpy. (cpy is typically a 32- or 64-bit value that stores an address; after the malloc it holds the address of the block you allocated, and the assignment then overwrites that address, so the block can no longer be freed.)
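The difference between assigning a pointer and copying the struct can be sketched like this. struct record is a hypothetical stand-in for SOME_STRUCT_TYPEDEF, and both function names are made up for the illustration.

```c
/* Hypothetical stand-in for SOME_STRUCT_TYPEDEF. */
struct record {
    int id;
    const char *name;
};

/* Pointer assignment: alias and orig name the same object. */
int alias_tracks_original(struct record *orig, int new_id)
{
    struct record *alias = orig;   /* copies only the address */
    orig->id = new_id;
    return alias->id == new_id;    /* true: the change is visible through alias */
}

/* Struct assignment: a shallow copy with its own members. */
int copy_keeps_old_value(struct record *orig, int new_id)
{
    struct record copy = *orig;    /* members copied; pointer members still shared */
    int old_id = copy.id;
    orig->id = new_id;
    return copy.id == old_id && copy.name == orig->name;
}
```

Both functions return 1: the alias sees the new id, while the copy keeps the old one but still shares the string that name points to.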
I'm a bit confused about malloc() function.
if sizeof(char) is 1 byte and the malloc() function accepts N bytes in argument to allocate, then if I do:
char* buffer = malloc(3);
I allocate a buffer that can store 3 characters, right?
char* s = malloc(3);
int i = 0;
while(i < 1024) { s[i] = 'b'; i++; }
s[i++] = '$';
s[i] = '\0';
printf("%s\n",s);
It works fine and stores 1024 b's in s.
bbbb[...]$
why doesn't the code above cause a buffer overflow? Can anyone explain?
malloc(size) returns a location in memory where at least size bytes are available for you to use. You are likely to be able to write to the bytes immediately after s[size - 1], but:
Those bytes may belong to other bits of your program, which will cause problems later in the execution.
Or, the bytes might be fine for you to write to - they might belong to a page your program uses, but aren't used for anything.
Or, they might belong to the structures that malloc() has used to keep track of what your program has used. Corrupting this is very bad!
Or, they might NOT belong to your program, which will result in an immediate segmentation fault. This is likely if you access say s[size + large_number]
It's difficult to say which one of these will happen because accessing outside the space you asked malloc() for will result in undefined behaviour.
In your example, you are overflowing the buffer, but not in a way that causes an immediate crash. Keep in mind that C does no bounds checking on array/pointer accesses.
Also, malloc() creates memory on the heap, but buffer overflows are usually about memory on the stack. If you want to create one as an exercise, use
char s[3];
instead. This will create an array of 3 chars on the stack. On most systems, there won't be any free space after the array, and so the space after s[2] will belong to the stack. Writing to that space can overwrite other variables on the stack, and ultimately cause segmentation faults by (say) overwriting the current stack frame's return pointer.
One other thing:
if sizeof(char) is 1 byte
sizeof(char) is actually defined by the standard to always be 1 byte. However, the size of that 1 byte might not be 8 bits on exotic systems. Of course, most of the time you don't have to worry about this.
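The number of bits in that byte is available as CHAR_BIT from <limits.h>. A small sketch (the function name bits_in is hypothetical):

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits in an object of `bytes` bytes on this implementation. */
size_t bits_in(size_t bytes)
{
    return bytes * CHAR_BIT;
}
```

bits_in(sizeof(char)) equals CHAR_BIT everywhere, and the standard requires CHAR_BIT to be at least 8.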
It is Undefined Behavior(UB) to write beyond the bounds of allocated memory.
Any behavior is possible, and no diagnostic is required for UB.
UB does not necessarily result in a segmentation fault.
In a way, you did overflow your 3 character buffer. However, you did not overflow your program's address space (yet). So you are well out of the bounds of s, and you are overwriting random other data in your program. Because your program owns this data, the program doesn't crash, but it still does very, very wrong things, and its future behaviour is undefined.
In practice what this is doing is corrupting the heap. The effects may not appear immediately (in fact, that's part of what makes such errors a PITA to debug). However, you may trash anything else that happens to be in the heap, or in that part of your program's address space for that matter. It's likely that you have also trashed malloc() internal data structures, and so it's likely that subsequent malloc() or free() calls may crash your program, leading many programmers to (falsely) believe they've found a bug in malloc().
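The way to avoid this kind of heap corruption is to size the allocation for everything that will be written, including the terminator. A minimal sketch of the question's loop done that way (the helper name make_filled is hypothetical):

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap string of `count` copies of `fill`
 * followed by '$', or NULL on failure. The caller must free() the result. */
char *make_filled(size_t count, char fill)
{
    char *s;

    if (count > SIZE_MAX - 2)
        return NULL;              /* count + 2 would overflow */
    s = malloc(count + 2);        /* count chars, then '$', then '\0' */
    if (s == NULL)
        return NULL;
    memset(s, fill, count);
    s[count] = '$';
    s[count + 1] = '\0';
    return s;
}
```

make_filled(1024, 'b') produces the same text as the question's code, but every byte written lies inside the block that malloc returned.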
You're overflowing the buffer. It depends what memory you're overflowing into to get an error msg.
Did you try running your code in release mode, or did you try to free the memory of s? It is undefined behavior.
It's a bit of a language hack, and its use is a bit dubious.
Help me understand malloc's behaviour. My code is as follows:
int main()
{
int *ptr=NULL;
ptr=(int *)malloc(1);
//check for malloc
*ptr=1000;
printf("address of ptr is %p and value of ptr is %d\n",ptr,*ptr);
return 0;
}
The above program works fine (runs without error). How? I have stored the value 1000 in an allocation of only 1 byte!
Am I overwriting the next memory address in the heap?
If yes, then why is there no SIGSEGV?
Many implementations of malloc will allocate at a certain "resolution" for efficiency.
That means that, even though you asked for one byte, you may well have gotten 16 or 32.
However, it's not something you can rely on since it's undefined behaviour.
Undefined behaviour means that anything can happen, including the whole thing working despite the problematic code :-)
With a debug heap you would typically get a crash or some other notification when you free the memory (but you didn't call free).
Segmentation faults are for page-level access violations, and a memory page is usually on the order of 4k, so an overrun by 3 bytes isn't likely to be detected until some finer grained check detects it or some other part of your code crashes because you overwrote some memory with 'garbage'
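For completeness, a minimal sketch of the allocation done with the right size, so that the store of 1000 is well-defined. The helper name new_int is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate room for one int and store value in it.
 * Returns NULL if the allocation fails; the caller must free() the result. */
int *new_int(int value)
{
    int *p = malloc(sizeof *p);   /* sizeof *p is the size of an int, not 1 */
    if (p != NULL)
        *p = value;
    return p;
}
```

Writing sizeof *p instead of a literal byte count keeps the request in step with the pointer's type, and the NULL check is the "check for malloc" that the question's comment left out.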