Explaining Heap Based Buffer Overflow to a beginner - heap-memory

I hope you're well. I'm trying to get my head around the topic in the title and how it works. Can anyone explain it in simple terms?
Thanks again everyone.

I hope this helps you:
First of all, the heap is a memory area used to manage dynamically allocated memory. Programmers typically use the heap to allocate memory whose size is not known at compile time, that is too large to fit on the stack, or that needs to be used across function calls.
A heap-based buffer overflow generally means that the buffer was allocated using a routine such as malloc(): a chunk of memory is allocated on the heap, and data is written into it without any bounds checking being done on the data. Heap-based attacks flood the memory space reserved for a program or process. Heap-based vulnerabilities are more difficult to exploit, so they are rarer.
I tried to find the easiest example and I found the following:
#include <stdlib.h>
#include <string.h>
#define BUFSIZE 256
int main(int argc, char **argv) {
    char *buf = malloc(sizeof(char) * BUFSIZE);  /* 256-byte heap buffer */
    strcpy(buf, argv[1]);  /* no bounds check: a longer argv[1] overflows the heap chunk */
    return 0;
}
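For contrast, here is a hedged sketch of how the same copy could be done with an explicit bounds check; the length limit, NULL checks and printf are my own additions for illustration, not part of the original example:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUFSIZE 256

int main(int argc, char **argv) {
    if (argc < 2)
        return 1;                       /* nothing to copy */

    char *buf = malloc(BUFSIZE);
    if (buf == NULL)
        return 1;

    /* Copy at most BUFSIZE-1 bytes and terminate, instead of trusting argv[1]. */
    strncpy(buf, argv[1], BUFSIZE - 1);
    buf[BUFSIZE - 1] = '\0';

    printf("%s\n", buf);
    free(buf);
    return 0;
}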

Related

How does union prevent memory fragmentation?

I am going through this link and learning C. Interesting part on the page:
The real purpose of unions is to prevent memory fragmentation by arranging for a standard size for data in the memory. By having a standard data size we can guarantee that any hole left when dynamically allocated memory is freed will always be reusable by another instance of the same type of union.
I understand this part by the following code:
typedef struct {
    char name[100];
    int age;
    int rollno;
} student;

typedef union {
    student *studentPtr;
    char *text;
} typeUnion;

int main(int argc, char **argv)
{
    typeUnion union1;
    //use union1.studentPtr
    union1.text = "Welcome to StackOverflow";
    //use union1.text
    return 0;
}
Well, in the above code union1.text reuses the space set aside for union1.studentPtr: not necessarily all of it, but it is the same storage.
Now, the part I don't understand is: when can the space freed by malloc not be reused, leading to memory fragmentation?
Edit: going through the comments and answers, the advice is to stick to the classic text. I'm adding this edit to the post presuming it will help beginners like me.
The comments have more expertise regarding unions in general.
Regarding your question specifically, this is my understanding:
A union sets aside memory for the largest datatype among its members. So, for example, a union containing a short int and a long int will set aside enough memory for a long int.
Now imagine that instead of a union you allocate a short int. But then you need a long int, so you free the short int and then malloc memory for a long int. The long int needs a contiguous block, and the hole left by the short int is too small for it, so the new block has to go somewhere else.
You are left with a small free hole in the middle of an otherwise used block of memory, sitting there waiting for a request small enough to fit into it.
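As a rough illustration of the "standard size" idea (my own sketch, not from the linked page): a union is always at least as large as its largest member, so every instance occupies the same amount of storage regardless of which member is in use. Note that real allocators also round small requests up to a minimum chunk size, so at short/long scale the hole described above may not actually appear in practice.

#include <stdio.h>

/* Hypothetical union mixing a small and a large member. */
union number {
    short s;
    long  l;
};

int main(void) {
    printf("sizeof(short)        = %zu\n", sizeof(short));
    printf("sizeof(long)         = %zu\n", sizeof(long));
    printf("sizeof(union number) = %zu\n", sizeof(union number));
    /* The union is sized (and aligned) for its largest member, so a freed
       slot that held one union instance can always hold another. */
    return 0;
}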
Aside: if you're learning C, I recommend the classic text. It's dated, but I love the simplicity, clarity and textbook-style approach.
That page is wrong; most programs simply assume memory usage will be fine and don't pay any attention to it.
In the following diagrams, assume each character represents, say 8 bytes. Letters represent different allocations, and underscores represent free memory. The scales involved are ludicrously small, and I'm skipping over details (like allocation metadata used by most malloc implementations), but the principles should be there.
Start with empty memory
_____________________________________________________
Then a bunch of 32-byte allocations occur as a program runs.
AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKKKLLLL________
The memory allocated to the program extends all the way to the last byte of the 'L' allocation; it's using 12*8*4 = 384 bytes.
Now the program frees every other allocation.
AAAA____CCCC____EEEE____GGGG____IIII____KKKK__________
Now the program is really only using 6*4*8 = 192 bytes, but the operating system has to keep all 352 bytes from the first 'A' to the last 'K' allocated for the program. Those freed gaps in between all the allocations are an example of memory fragmentation.
Now the program wants to allocate another 32-byte block. It could happen like this:
AAAAMMMMCCCC____EEEE____GGGG____IIII____KKKK_________
The new allocation fits in one of the gaps created by the frees, and this is fine since it recycles one of the gaps so we're wasting less space.
Now say the program needs to allocate a 40 byte block. None of the gaps is big enough, so the allocation has to go at the end and the operating system has to allocate more memory for the program, 352+40=392 bytes. All of the memory in those gaps is being wasted. This is the kind of waste due to memory fragmentation the webpage is talking about.
If all of the allocations had been 40 bytes to start with, then gap recycling could be maximized.
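If you want to poke at this yourself, here is a rough sketch of the same pattern. It is purely illustrative: where the allocator actually places each block, and how much it rounds small requests up, is implementation-specific, which is also why the "too big" request below uses 64 bytes rather than 40.

#include <stdio.h>
#include <stdlib.h>

#define NBLOCKS 12

int main(void) {
    void *blocks[NBLOCKS];
    int i;

    /* A run of equal-sized allocations, like A..L in the diagram above. */
    for (i = 0; i < NBLOCKS; i++)
        blocks[i] = malloc(32);

    /* Free every other block, punching small holes into the heap. */
    for (i = 1; i < NBLOCKS; i += 2) {
        free(blocks[i]);
        blocks[i] = NULL;
    }

    /* Another 32-byte request can recycle one of the holes... */
    void *fits = malloc(32);
    /* ...while a clearly larger request typically cannot, so the allocator
       has to place it elsewhere (often past the existing blocks). */
    void *too_big = malloc(64);

    printf("existing block: %p\n", blocks[0]);
    printf("32-byte block:  %p\n", fits);
    printf("64-byte block:  %p\n", too_big);

    free(fits);
    free(too_big);
    for (i = 0; i < NBLOCKS; i += 2)
        free(blocks[i]);
    return 0;
}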

Why won't realloc work in this example?

My professor gave us an "assignment" to find why realloc() won't work in this specific example.
I tried searching this site, and I think it won't work because there is no real way to determine the size of a memory block allocated with malloc(), so realloc() doesn't know the new size of the memory block that it needs to reallocate.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <windows.h>

int main()
{
    MEMORYSTATUS memInfo;
    memInfo.dwLength = sizeof(MEMORYSTATUS);
    GlobalMemoryStatus(&memInfo);
    double slobodno = memInfo.dwAvailVirtual/1024./1024.;
    printf("%g MB\n", slobodno);
    int br = 0, i, j;
    char **imena, *ime, *temp, *bbb = NULL;
    imena = (char**) malloc(sizeof(char*)*(br+1));
    while (1)
    {
        printf("Unesite ime: ");
        ime = (char*) malloc(sizeof(char)*4000000);
        gets(ime);
        printf("%u\n", strlen(ime));
        ime = (char*) realloc(ime, strlen(ime)+1);
        GlobalMemoryStatus(&memInfo);
        slobodno = memInfo.dwAvailVirtual/1024./1024.;
        printf("%g MB\n", slobodno);
        if (strcmp(ime, ".") == 0)
        {
            free(ime);
            free(imena[br]);
            break;
        }
        imena[br++] = ime;
        imena = (char**) realloc(imena, sizeof(char*)*(br+1));
    }
    for (i = 0; i < br-1; i++)
        for (j = i+1; j < br; j++)
            if (strcmp(imena[i], imena[j]) > 0)
            {
                temp = imena[i];
                imena[i] = imena[j];
                imena[j] = temp;
            }
    // sorting goes here
    for (i = 0; i < br; i++)
        printf("%s\n", imena[i]);
    for (i = 0; i < br; i++)
        free(imena[i]);
    free(imena);
    return 0;
}
Note: the professor added the lines for printing out the available memory so we can see that realloc() doesn't work. Every new string we enter just takes up sizeof(char)*4000000 bytes and can't be reallocated. I'm trying to find out why. Thanks in advance.
I have a feeling that it has something to do with the page sizes on Windows.
For example, if you change 4000000 to 400000, you can see that the memory can be re-used.
I think that allocating 4000000 forces Windows to use "huge" page sizes (of 4MB) and for some (unknown to me) reason, realloc doesn't work on them in the way that you would expect (i.e. making unused memory available for other allocations).
This seems to be related to Realloc() does not correctly free memory in Windows, which mentions VirtualAlloc, but I'm not sure it clarifies the exact reason that realloc doesn't work.
From MSDN:
The memblock argument points to the beginning of the memory block. If memblock is NULL, realloc behaves the same way as malloc and allocates a new block of size bytes.
So the line ime=(char*) realloc(NULL,sizeof(char)*4000000); just mallocs new memory each time.
realloc doesn't free memory. These functions work with a big block of memory (called a "heap") and carve chunks out when you call realloc/malloc/calloc. If you need more memory than is in the heap at the moment, then the heap is expanded by asking the operating system for more memory.
When you call realloc to make a memory block smaller, all that happens is that the memory you don't need any more is made available for *alloc to hand out again on a different request. Neither realloc nor free ever shrink the heap to return memory back to the operating system. (If you need that to happen, you need to call the operating system's native memory allocation procedures, such as VirtualAlloc on Windows.)
The problem is not that realloc doesn't know the size of the original block. Even though that information is not available for us programmers, it is required to be available to realloc (even if the block was allocated with malloc or calloc).
The line
ime=(char*) realloc(ime,strlen(ime)+1);
looks like it is shrinking the previously allocated block to fit the contents exactly, but there is no requirement that it actually shrinks the block of memory and makes the remainder available again for a new allocation.
Edit
Another thing I just thought of: the shrinking with realloc might work OK, but the memory is not returned by the runtime library to the OS because the library keeps it around for the next allocation.
Only, the next allocation is for such a large block that it does not fit in the memory freed up by realloc.
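If you want to see the shrinking behaviour concretely on a system where you can inspect the allocator, here is a glibc-oriented sketch. malloc_usable_size and malloc_trim are glibc extensions, not standard C, and the Microsoft CRT used in the question behaves differently; the 64 kB size is chosen to stay below glibc's default mmap threshold so that the block comes from the main heap.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <malloc.h>   /* glibc extensions: malloc_usable_size, malloc_trim */

int main(void) {
    char *p = malloc(64000);
    if (p == NULL)
        return 1;
    strcpy(p, "hello");

    printf("usable size before shrink: %zu\n", malloc_usable_size(p));

    /* Shrink the block to just what the string needs. */
    char *q = realloc(p, strlen(p) + 1);
    if (q != NULL)
        p = q;

    printf("usable size after shrink:  %zu\n", malloc_usable_size(p));

    /* The trimmed-off space goes back to the allocator's free lists, not to
       the OS; glibc can be asked to try returning free heap memory with: */
    malloc_trim(0);

    free(p);
    return 0;
}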

How do I calculate beforehand how much memory calloc would allocate?

I basically have this piece of code.
char (* text)[1][80];
text = calloc(2821522,80);
The way I calculated it, that calloc should have allocated 215.265045 megabytes of RAM; however, the program ended up exceeding that and using nearly 700 MB of RAM.
So it appears I cannot properly know how much memory that function will allocate.
How does one calculate that properly?
calloc (and malloc for that matter) is free to allocate as much space as it needs to satisfy the request.
So, no, you cannot tell in advance how much it will actually give you, you can only assume that it's given you the amount you asked for.
Having said that, 700 MB seems a little excessive, so I'd investigate whether the calloc was solely responsible for that by, for example, writing a program that only does the calloc and nothing more.
You might also want to investigate how you're measuring that memory usage.
For example, the following program:
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>

int main (void) {
    char (* text)[1][80];
    struct mallinfo mi;

    mi = mallinfo(); printf ("%d\n", mi.uordblks);
    text = calloc(2821522, 80);
    mi = mallinfo(); printf ("%d\n", mi.uordblks);

    return 0;
}
outputs, on my system:
66144
225903256
meaning that the calloc has allocated 225,837,112 bytes which is only a smidgeon (115,352 bytes or 0.05%) above the requested 225,721,760.
Well it depends on the underlying implementation of malloc/calloc.
It generally works like this: there is a thing called the heap pointer which points to the top of the heap, the area from which dynamic memory gets allocated. When memory is first allocated, malloc internally requests x amount of memory from the kernel, i.e. the heap pointer is incremented by a certain amount to make that space available. That x may or may not be equal to the size of the memory block you requested (it might be larger to account for future mallocs). If it isn't, then you're given at least the amount of memory you requested (sometimes you're given more memory because of alignment issues). The rest is made part of an internal free list maintained by malloc. To sum it up, malloc has some underlying data structures, and a lot depends on how they are implemented.
My guess is that the x amount of memory was larger (for whatever reason) than you requested and hence malloc/calloc was holding on to the rest in its free list. Try allocating some more memory and see if the footprint increases.
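If you just want to see how much the allocator actually handed back for a given request, glibc has malloc_usable_size (a non-standard extension; this is only a sketch of that idea, using the same request as in the question):

#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>   /* glibc: malloc_usable_size */

int main(void) {
    size_t requested = 2821522UL * 80UL;   /* same request as in the question */
    char *p = calloc(2821522, 80);
    if (p == NULL)
        return 1;

    /* The allocator may hand back a somewhat larger chunk than requested
       (alignment, bookkeeping, rounding). */
    printf("requested: %zu bytes\n", requested);
    printf("usable:    %zu bytes\n", malloc_usable_size(p));

    free(p);
    return 0;
}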

Problem with free() on structs in C. It doesn't reduce memory usage

I'm having a problem with free() on a struct in my C program. When I look at /proc/<pid>/statm before and after the free it doesn't seem to reduce. Am I using free() wrong in this case, or am I reading /proc/<pid>/statm wrong?
Here is a test case which yields the problem:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct mystruct {
    unsigned int arr[10000];
};

void mem() {
    char buf[30];
    snprintf(buf, 30, "/proc/%u/statm", (unsigned)getpid());
    FILE* pf = fopen(buf, "r");
    if (pf) {
        unsigned size;     // total program size
        unsigned resident; // resident set size
        unsigned share;    // shared pages
        unsigned text;     // text (code)
        unsigned lib;      // library
        unsigned data;     // data/stack
        unsigned dt;       // dirty pages (unused in Linux 2.6)
        fscanf(pf, "%u %u %u %u %u %u", &size, &resident, &share, &text, &lib, &data);
        printf("Memory usage: Data = %ld\n", (long)data * sysconf(_SC_PAGESIZE));
        fclose(pf);
    }
}

int main(int argc, char **argv) {
    mem();
    struct mystruct *foo = (struct mystruct *)malloc(sizeof(struct mystruct));
    mem();
    free(foo);
    mem();
}
The output is:
Memory usage: Data = 278528
Memory usage: Data = 282624
Memory usage: Data = 282624
When I would expect it to be:
Memory usage: Data = 278528
Memory usage: Data = 282624
Memory usage: Data = 278528
I've done a similar test with malloc'ing a (char *), then free'ing it and it works fine. Is there something special about structs?
Your answer is right over here on Stack Overflow, but the short version is that, for very good reasons, the memory allocator does not return memory to the host OS but keeps it (internally in your program's data space) as a free list of some kind.
Some of the reasons the library keeps the memory are:
Interacting with the kernel is much slower than simply executing library code
The benefit would be small. Most programs have a steady-state or increasing memory footprint, so the time spent analyzing the heap looking for returnable memory would be completely wasted.
Internal fragmentation makes page-aligned blocks (the only thing that could be returned to the kernel) unlikely to exist, another reason not to slow the program down looking for something that won't be there.
Returning a page embedded in a free block would fragment the low and high parts of the block on either side of the page.
The few programs that do return large amounts of memory are likely to bypass malloc() and simply allocate and free pages anyway using mmap(2).
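As a rough Linux/glibc sketch of that last point (the ~128 KiB figure below is glibc's default M_MMAP_THRESHOLD, and the exact numbers depend on the kernel, the glibc version, and how much slack the heap already has): small blocks like the 40000-byte struct above are carved out of the heap and stay with the process after free, while large blocks are usually served by mmap and handed straight back to the kernel when freed. The report() helper is just an inline version of the question's mem() idea.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Print the "data" field of /proc/self/statm in bytes (Linux-specific). */
static void report(const char *label) {
    unsigned long size, resident, share, text, lib, data;
    FILE *pf = fopen("/proc/self/statm", "r");
    if (pf == NULL)
        return;
    if (fscanf(pf, "%lu %lu %lu %lu %lu %lu",
               &size, &resident, &share, &text, &lib, &data) == 6)
        printf("%-20s data = %lu bytes\n", label, data * sysconf(_SC_PAGESIZE));
    fclose(pf);
}

int main(void) {
    report("start");

    /* Below glibc's default mmap threshold (~128 KiB): carved from the heap,
       so free() usually does not shrink the process footprint. */
    void *small = malloc(40000);
    report("after small malloc");
    free(small);
    report("after small free");

    /* Well above the threshold: usually a private mmap, which free()
       unmaps, handing the pages straight back to the kernel. */
    void *big = malloc(8 * 1024 * 1024);
    report("after big malloc");
    free(big);
    report("after big free");

    return 0;
}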
When free actually releases the memory is implementation-dependent, so maybe free is not returning the memory right away when it's a big chunk of memory. I don't think it has anything to do with structs.
Mainly for performance reasons, the allocated heap memory won't be returned to the OS after being freed. It will be marked as free though, and maybe later the kernel will get it back, or your program will allocate it and use it again.
I don't know what you used to alloc/free your (char *). The difference you've seen might be that your (char *) was allocated on the stack, and the release/free process is different than with the heap (stack memory management is a lot simpler).
It is up to the OS to really free your data and thus shrink the memory consumption of your program. You only tell it that you won't use that memory anymore.

Heap size limitation in C

I have a question about the heap in the program execution layout of a C program.
I know that all dynamically allocated memory is allotted on the heap, which grows dynamically. But I would like to know: what is the maximum heap size for a C program?
I am attaching a sample C program in which I try to allocate 1 GB of memory to a buffer and even memset it.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    char *mybuffer, *temp;
    mybuffer = malloc(1024*1024*1024*1);
    temp = memset(mybuffer, 0, (1024*1024*1024*1));
    if ((mybuffer == temp) && (mybuffer != NULL))
        printf("%p - %p\n", (void *)mybuffer, (void *)&mybuffer[(1024*1024*1024*1)-1]);
    else
        printf("Wrong\n");
    sleep(20);
    free(mybuffer);
    return 0;
}
If I run the above program in 3 instances at once, then malloc should fail in at least one instance (I feel so), but malloc is still successful.
If it is successful, can I know how the OS takes care of 3 GB of dynamically allocated memory?
Your machine is very probably overcommitting on RAM, and not using the memory until you actually write to it. Try writing to each block after allocating it, thus forcing the operating system to ensure there's real RAM mapped to the address malloc() returned.
From the Linux malloc man page:
BUGS
By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available. This is a really bad bug. In case it turns out that the system is out of memory, one or more processes will be killed by the infamous OOM killer. In case Linux is employed under circumstances where it would be less desirable to suddenly lose some randomly picked processes, and moreover the kernel version is sufficiently recent, one can switch off this overcommitting behavior using a command like:
# echo 2 > /proc/sys/vm/overcommit_memory
See also the kernel Documentation directory, files vm/overcommit-accounting and sysctl/vm.txt.
You're mixing up physical memory and virtual memory.
http://apollo.lsc.vsc.edu/metadmin/references/sag/x1752.html
http://en.wikipedia.org/wiki/Virtual_memory
http://duartes.org/gustavo/blog/post/anatomy-of-a-program-in-memory
Malloc will allocate the memory but it does not write to any of it. So if the virtual memory is available then it will succeed. It is only when you write something to it that real memory (or page file space) actually has to be committed.
Calloc, if memory serves me correctly(!), writes zeros to each byte of the allocated memory before returning, so it will need to allocate the pages there and then.
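Here is a hedged sketch of the difference between reserving and actually touching memory on a Linux system with overcommit enabled; whether the untouched allocation even succeeds, and what you observe, depends on vm.overcommit_memory, available swap, and the allocator:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define ONE_GB (1024UL * 1024UL * 1024UL)

int main(void) {
    /* With overcommit, this usually succeeds even if 1 GB of RAM+swap is not
       actually free: only address space has been promised so far. */
    char *block = malloc(ONE_GB);
    if (block == NULL) {
        printf("malloc failed\n");
        return 1;
    }
    printf("malloc of 1 GB succeeded, nothing is resident yet\n");
    sleep(10);   /* check RES for this process in top: still tiny */

    /* Touching every page forces the kernel to back it with real memory;
       if the system cannot, the OOM killer may step in at this point. */
    memset(block, 0, ONE_GB);
    printf("1 GB touched, now it is resident\n");
    sleep(10);   /* check RES again: roughly 1 GB now */

    free(block);
    return 0;
}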
