Is there any way to avoid static memory area overflow?

The problem is very obvious, so I will just show you some code :)
#include <stdio.h>
#include <string.h>

char *test1()
{
    static char c;
    char *p;

    p = &c;
    printf("p[%p] : %s\n", (void *)p, p);
    return p;
}

void *test2()
{
    static char i;
    char *buf;
    int counter = 0;

    for (buf = (char *)&i; ; counter += 8)
    {
        memset(buf + counter, 0xff, 8);
        printf("write %d bytes to static area!\n", counter);
    }
}

int main()
{
    char *p;

    p = test1();
    strcpy(p, "lol i asd");
    p = test1();
    strcpy(p, "sunus again!");
    p = test1();
    strcpy(p, "sunus again! i am hacking this!!asdfffffffffffffffffffffffffffffffffffffffffffffffffffffffffff");
    p = test1();
    test2();
    return 0;
}
First I wrote test1().
As you can see, those strcpy calls should cause a segmentation fault, because they are obviously accessing illegal memory. I know the basics of static variables, but this behavior is strange to me.
Then I wrote test2().
Finally, it caused a segmentation fault, but only after writing almost 4 KB.
So, how can I prevent this kind of error (static variable overflow) from happening?
Why can I access those static memory areas at all?
I know they are in neither the stack nor the heap.
PS. Maybe I am not describing my problem clearly.
I have some years of C programming experience; I know what happens when these variables are not static.
Making them static changes almost everything, and I want to know why.

Undefined behaviour is just that - undefined. It might look like it's working, it might crash, it might steal your lunch money. Just don't do it.

A segmentation fault only occurs when you touch a page of memory your process has not mapped. Pages are typically 4 KiB, so with luck you can write almost a full 4 KiB past the end before it happens, and if the next page is already mapped, even that isn't guaranteed. GCC's stack protector can help sometimes, but not in this case (these variables are not on the stack). Valgrind can help too. None of these are guaranteed; you are better off taking care of it yourself.

You cannot avoid the potential for overwriting memory that you did not allocate in C.
As to the manner of the failure... When, where and how your application crashes or otherwise misbehaves depends entirely on your compiler, the compiler flags, and the random state memory was in before your program started executing. You are accessing memory you are not supposed to, and it's completely undefined in C what impact that access will have on the operating environment.
Languages that manage memory (e.g. Java, C#) do that for you, but of course there is a cost to the bounds checking.
You can certainly use memory management libraries (replacements for malloc/free/new/delete) that will attempt to detect improper memory management.

You avoid the problem by knowing how big your memory areas are and by not writing outside the boundaries of those areas. There's no other reliable way to deal with the issue.

Related

What causes this bug to be non-deterministic

Recently, I wrote the following buggy C code:
#include <stdio.h>

struct IpAddr {
    unsigned char a, b, c, d;
};

struct IpAddr ipv4_from_str(const char *str) {
    struct IpAddr res = {0};
    sscanf(str, "%d.%d.%d.%d", &res.a, &res.b, &res.c, &res.d);
    return res;
}

int main(int argc, char **argv) {
    struct IpAddr ip = ipv4_from_str("192.168.1.1");
    printf("Read in ip: %d.%d.%d.%d\n", ip.a, ip.b, ip.c, ip.d);
    return 0;
}
The bug is that I use %d in sscanf while supplying pointers to the 1-byte-wide unsigned char members. %d expects a pointer to a (typically 4-byte) int, and the size mismatch results in an out-of-bounds write; the correct conversion for unsigned char would be %hhu. An out-of-bounds write is definitely UB, and the program crashes.
My confusion is in the non-constant nature of the bug. Over 1,000 runs, the program segfaults before the print statement 50% of the time, and segfaults after the statement the other 50% of the time. I don't understand why this can change. What is the difference between two invocations of the program? I was under the impression that the memory layout of the stack is consistent, and small test programs I've written seem to confirm this. Is that not the case?
I'm using gcc v11.3.0 on Debian bookworm, kernel 5.14.16-1, and I compiled without any flags set.
Here is the assembly output from my compiler for reference.
Undefined behavior means that anything can happen, even inconsistent results.
In practice, this inconsistency is most likely due to Address Space Layout Randomization. Depending on how the data is located in memory, the out-of-bounds accesses may or may not access unallocated memory or overwrite a critical pointer.
See also Why don't I get a segmentation fault when I write beyond the end of an array?

Strings and Dynamic allocation in C [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Undefined, unspecified and implementation-defined behavior
This should segfault. Why doesn't it?
#include <string.h>
#include <stdio.h>

char str1[] = "Sample string. Sample string. Sample string. Sample string. Sample string. ";
char str2[2];

int main()
{
    strcpy(str2, str1);
    printf("%s\n", str2);
    return 0;
}
I am using gcc version 4.4.3 with the following command:
gcc -std=c99 testString.c -o test
I also tried setting optimisation to 0 (-O0).
This should seg fault
There's no reason it "should" segfault. The behaviour of the code is undefined. This does not mean it necessarily has to crash.
A segmentation fault only occurs when you perform an access to memory the operating system knows you're not supposed to.
So, what's likely going on, is that the OS allocates memory in pages (which are typically around 4KiB). str2 is probably on the same page as str1, and you're not running off the end of the page, so the OS doesn't notice.
That's the thing about undefined behavior. Anything can happen. Right now, that program actually "works" on your machine. Tomorrow, str2 may be put at the end of a page, and then segfault. Or possibly, you'll overwrite something else in memory, and have completely unpredictable results.
edit: how to cause a segfault:
Two ways. One is still undefined behavior, the other is not.
int main() {
    *((volatile char *)0) = 42; /* undefined behavior, but normally segfaults */
}
Or to do it in a defined way:
#include <signal.h>

int main() {
    raise(SIGSEGV); /* segfault using defined behavior */
}
edit: third and fourth way to segfault
Here is a variation of the first method using strcpy:
#include <string.h>

const char src[] = "hello, world";

int main() {
    strcpy(0, src); /* undefined */
}
And this variant only crashes for me with -O0:
#include <string.h>

const char src[] = "hello, world";

int main() {
    char too_short[1];
    strcpy(too_short, src); /* smashes stack; undefined */
}
Your program writes beyond the allocated bounds of the array; this results in undefined behavior.
The program is ill-formed: it might crash or it might not, and an explanation may or may not be possible.
It probably doesn't crash because it overwrites memory beyond the array bounds that isn't currently being used, but it will crash once the rightful owner of that memory tries to access it.
A segfault is NOT guaranteed behavior. It is one possible (and sometimes likely) outcome of doing something bad. Another possible outcome is that it works by pure luck. A third possible outcome is nasal demons.
If you really want to find out what this might be corrupting, I would suggest you look at what follows the overwritten memory. Generate a linker map file; that should give you a fair idea, though it all depends on how things are laid out in memory. You can also run this under gdb to work out why it does or does not segfault. That said, the granularity of hardware-assisted access-violation checks cannot be finer than a page unless some software magic is thrown in; even with page-granularity checking, it may happen that the immediately following page really does belong to your program and is writable. Tools like valgrind (and libefence) detect finer-grained violations in software; as I understand it (correct me if I am wrong!), they use some form of markers around allocations, or comparisons against the recorded bounds of each allocation, to check whether an out-of-bounds access has happened.

The function of malloc (using malloc correctly)

So I'm quite new at this; sorry if this sounds like a dumb question.
I'm trying to understand malloc by writing a very simple program which will print "ABC" using ASCII codes.
Here is my code (what our professor taught us) so far:
char *i;

i = malloc(sizeof(char) * 4);
*i = 65;        /* 'A' */
*(i + 1) = 66;  /* 'B' */
*(i + 2) = 67;  /* 'C' */
*(i + 3) = '\0';
What I don't understand is: why do I have to put malloc there?
The professor told us the program won't run without the malloc,
but when I tried to run it without the malloc, the program ran just fine.
So what is the function of malloc here?
Am I even using it right?
Any help and/or explanation would be really appreciated.
the professor told us the program won't run without the malloc
This is not quite true, the correct wording would be: "The program's behavior is undefined without malloc()".
The reason for this is that
char *i;
just declares a pointer to a char, but there's no initialization -- this pointer points to some indeterminate location. You could be just lucky in that writing values to this "random" location works and won't result in a crash. I'd personally call it unlucky because this hides a bug in your program. undefined behavior just means anything can happen, including a "correct" program execution.
malloc() will dynamically request some usable memory and return a pointer to that memory, so after the malloc(), you know i points to 4 bytes of memory you can use. If malloc() fails for some reason (no more memory available), it returns NULL -- your program should test for it before writing to *i.
All that said, of course the program CAN work without malloc(). You could just write
char i[4];
and i would be a local variable with room for 4 characters.
Final side note: sizeof(char) is defined to be 1, so you can just write i = malloc(4);.
Unfortunately, the "runs fine" criterion proves nothing about a C program. A great many C programs that run to completion have undefined behavior which just doesn't happen to manifest itself on your particular platform.
You need special tools to see this error. For example, you can run your code through valgrind and watch it access an uninitialized pointer.
As for the malloc, you do not have to use dynamic buffer in your code. It would be perfectly fine to allocate the buffer in automatic memory, like this:
char buf[4], *i = buf;
You have to allocate the memory you write to. In the example below, I did not allocate memory for i, which resulted in a segmentation fault (trying to access memory that you don't have access to):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *i;

    strcpy(i, "hello");
    printf("%s\n", i);
    return (0);
}
Output: Segmentation fault (core dumped)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *i;

    /* Allocate 6 bytes: 5 for "hello" plus 1 for the '\0' terminator */
    i = malloc(sizeof(char) * 6);
    strcpy(i, "hello");
    printf("%s\n", i);
    return (0);
}
Result: hello
malloc reserves space for you, so you can write to that spot in memory. In the first example, "it won't work without malloc" because i points to a location that has no space allocated for it yet.

Strange behaviour of free() after memory allocation violation

Not so long ago I was hunting a bug in a big-number library I was writing, and it cost me quite a while. The problem was that I violated the memory bounds of some structure member, but instead of a segmentation fault or a plain crash, it did something unexpected (at least, I did not expect it). Let me introduce an example:
segmentation_fault.c
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <signal.h>

#define N 100 /* arbitrary large number */

typedef unsigned char byte;

void exitError(char *);
void segmentationFaultSignalHandler(int);

sig_atomic_t segmentationFaultFlag = 0;

int main(void)
{
    int i, memorySize = 0;
    byte *memory;

    if (setvbuf(stdout, NULL, _IONBF, 0))
        exitError("setvbuf() failed");
    if (signal(SIGSEGV, segmentationFaultSignalHandler) == SIG_ERR)
        exitError("signal() failed");
    for (i = 0; i < N; ++i)
    {
        printf("Before malloc()\n");
        if ((memory = malloc(++memorySize * sizeof(byte))) == NULL)
            exitError("allocation failed");
        printf("After malloc()\n");
        printf("Before segmentation fault\n");
        memory[memorySize] = 0x0D; /* out-of-bounds write */
        if (segmentationFaultFlag)
            exitError("detected segmentation fault");
        printf("After segmentation fault\n");
        printf("Before free()\n");
        free(memory);
        printf("After free()\n");
    }
    return 0;
}

void segmentationFaultSignalHandler(int signal)
{
    segmentationFaultFlag = 1;
}

void exitError(char *errorMessage)
{
    printf("ERROR: %s, errno=%d.\n", errorMessage, errno);
    exit(1);
}
As we can see, the line memory[memorySize] = 0x0D; clearly violates the bounds given by malloc(), but it does not crash or raise a signal (I know that according to ISO C99 / C11, signal handling is implementation-defined and a signal does not have to be raised at all when memory bounds are violated). It moves on, printing After segmentation fault, Before free() and After free(), but a couple of iterations later it crashes, always inside free() (printing After segmentation fault and Before free(), but not After free()). I was wondering what causes this behavior, and what the best way is to detect memory access violations (I'm ashamed to admit I always used printfs to determine where a program crashed, but surely there are better tools), as they are very hard to find: most often the program does not crash at the offending code but, as in the example, later on, when it tries to do something with that memory again. Surely I should be able to free this memory, as I allocated it correctly and did not modify the pointer.
You can only detect such violations in an instrumented environment.
Once you write outside the memory the system gave you, you can't believe anything any more: everything that happens from then on is undefined behavior, and you CAN'T predict what will happen, because there are no rules.
So if you want to check a program for memory leaks or read/write violations, you have to run it under a tool that owns a larger memory area and hands only part of it to the program under test. The tool inspects the process, keeps track of where it reads and writes, and uses the rest of the area as guard regions (for example, by setting marker flags and checking whether they got changed) to decide whether each access was allowed.
If the program strays outside the area the tool owns, even the tool can't be sure it will detect the misbehavior.
In other words, you need your own memory management layer to check for such behavior.
When reading or writing memory you don't own, you get undefined behavior.
This doesn't always result in a segmentation fault. In practice it is much more likely that the code will corrupt some other data, and your program will crash in some other place, which makes it hard to debug.
In this example you wrote to an invalid heap address. You likely corrupted internal heap structures, which makes it likely that the program will crash on a following malloc or free call.
There are tools that check your heap usage and can tell you if you write out of your bounds. I like and would recommend valgrind for linux and gflags for windows.
When malloc returns a pointer to a chunk of memory, it keeps some additional information about that pointer (such as the size of the allocated space). This information is usually stored at addresses right before the returned pointer. Also, malloc will often return a pointer to a bigger chunk than you asked for; as a consequence, the addresses immediately before and after the pointer are valid, and you can write there without provoking a segmentation fault or other system error. However, if you write there, you may overwrite the data malloc needs to free the memory correctly, and the behavior of subsequent malloc and free calls is undefined from that point on.

Read and write to a memory location

After doing a lot of research on Google, I found this program:
#include <stdio.h>

int main()
{
    int val;
    char *a = (char *)0x1000;

    *a = 20;
    val = *a;
    printf("%d", val);
}
But it crashes at run time, at *a = 20.
So how can I write and read a specific memory location?
You are doing it; it's just that on your system you are not allowed to write to that memory, and that causes the segmentation fault.
A segmentation fault (often shortened to segfault), bus error or access violation is generally an attempt to access memory that the CPU cannot physically address. It occurs when the hardware notifies an operating system about a memory access violation. The OS kernel then sends a signal to the process which caused the exception. By default, the process receiving the signal dumps core and terminates. The default signal handler can also be overridden to customize how the signal is handled.
If you are interested in knowing more look up MMU on wikipedia.
Here is how to legally request memory from the heap. The malloc() function takes a number of bytes to allocate as a parameter. Please note that every malloc() should be matched by a free() call to that same memory after you are done using it. The free() call should normally be in the same function as where you called malloc().
#include <stdio.h>
#include <stdlib.h> /* for malloc() and free() */

int main()
{
    int val;
    char *a;

    a = (char *)malloc(sizeof(char) * 1);
    *a = 20;
    val = (int)*a;
    printf("%d", val);
    free(a);
    return 0;
}
You can also allocate memory on the stack in a very simple way, like so:
#include <stdio.h>

int main()
{
    int val;
    char *a;
    char b;

    a = &b;
    *a = 20;
    val = (int)*a;
    printf("%d", val);
    return 0;
}
This is throwing a segment violation (SEGFAULT), as it should, as you don't know what is put in that address. Most likely, that is kernel space, and the hosting environment doesn't want you willy-nilly writing to another application's memory. You should only ever write to memory that you KNOW your program has access to, or you will have inexplicable crashes at runtime.
If you are running your code in user space (which you are), all the addresses you see are virtual addresses, not physical ones. You cannot just assume an arbitrary virtual address is writable.
In fact, under the virtual memory model you cannot assume any address is valid at all. It is up to the memory manager to hand valid addresses to your program, not the other way round.
For your program to be able to write to an address:
It must be a valid virtual address
It must be accessible within the address space of your program
You can't just write at a random address. You can only modify the contents of memory where your program is allowed to write.
If you need to modify the contents of some variable, that's what pointers are for:
char a = 'x';
char *ptr = &a; /* ptr holds the address of a */
*ptr = 'b';     /* you just wrote to that address: a is now 'b' */
It's an issue of permissions: the OS protects memory from random access.
You didn't specify the exact error, but my guess is you are getting a "segmentation fault", which clearly indicates a memory access violation.
You can write to a specific memory location.
But only when you have the rights to write to that location.
What you are getting, is the operating system forbidding you to write to 0x1000.
That is the correct way to write data to memory location 0x1000. But in this day and age of virtual memory, you almost never know the actual memory location you want to write to in advance, so this type of thing is never used.
If you can tell us the actual problem you're trying to solve with this, maybe we can help.
You cannot randomly pick a memory location and write to it. The memory location must be allocated to you and must be writable.
Generally speaking, you can take the address of a variable with & and write data through it, or you can use malloc() to ask for space on the heap to write data to.
This answer only covers how to write and read data in memory, not how to do it in a way that keeps the program functioning normally; the other answers probably cover that better than mine.
First, be sure about the memory location you want to write to.
Then, check whether you have permission to write there.