At what point does the segfault occur? - c

Does the following code segfault at array[10] = 22 or array[9999] = 22?
I'm just trying to figure out whether the whole loop would execute before it segfaults (in the C language).
#include <stdio.h>

int main(){
    int array[10];
    int i;
    for(i=0; i<9999; ++i){
        array[i] = 22;   /* out of bounds once i >= 10 */
    }
    return 0;
}

It depends...
If the memory after array[9] is clean then nothing may happen until, of course, one reaches a segment of memory which is occupied.
Try out the code and add:
printf("%d\n",i);
in the loop and you will see when it crashes and burns.
I get various results, ranging from 596 to 2380.

Use a debugger?
$ gcc -g seg.c -o so_segfault
$ gdb so_segfault
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...
(gdb) run
Starting program: /.../so_segfault
Program received signal SIGSEGV, Segmentation fault.
0x080483b1 in main () at seg.c:7
7 array[i] = 22;
(gdb) print i
$1 = 2406
(gdb)
In fact, if you run this again, you will see that the segfault does not always occur for the same value of i. What is certain is that it can only happen when i>=10, but there is no way to predict the value of i at which it will crash, because this is not deterministic: it depends on how the process's memory is laid out. If the memory is accessible up to array[222] (i.e. those addresses are still mapped into your process), it will go on until i=222, but it might just as well crash for any other value of i>=10.

The answer is maybe. The C language says nothing about what should happen in this case. It is undefined behavior. The compiler is not required to detect the problem, do anything to handle the problem, terminate the program or anything else. And so it does nothing.
You write to memory that's not yours, and in practice one of three things may happen:
You might be lucky, and just get a segfault. This happens if you hit an address that is not allocated to your process. The OS will detect this, and throw an error at you.
You might hit memory that's genuinely unused, in which case no error will occur right away. But if the memory is allocated and used later, it will overwrite your data, and if you expect it to still be there by then, you'll get some nice delayed-action errors.
You might hit data that's actually used for something else already. You overwrite that, and sometime soon, when the original data is needed, it'll read your data instead, and unpredictable errors will ensue.
Writing out of bounds: Just don't do it. The C language won't do anything to tell you when it happens, so you have to keep an eye on it yourself.
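One way to "keep an eye on it yourself" is to pass the array's capacity along with the pointer and clamp every write to it. The following is only a minimal sketch; fill_checked and ARRAY_LEN are hypothetical names that do not appear in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in a true array (does not work on a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: store `value` in at most `capacity` elements,
   even if the caller asks for more. Returns how many were written. */
size_t fill_checked(int *array, size_t capacity, size_t requested, int value)
{
    assert(array != NULL);
    size_t n = requested < capacity ? requested : capacity;
    for (size_t i = 0; i < n; ++i) {
        array[i] = value;   /* i < capacity always holds here */
    }
    return n;
}
```

With int array[10], the call fill_checked(array, ARRAY_LEN(array), 9999, 22) writes 10 elements and returns 10, so the caller can tell that the request was cut short.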

When and if your code crashes is not deterministic. It'll depend on what platform you're running the code on.
array is a stack variable, so your compiler is going to reserve 10 * sizeof(int) bytes on the stack for it. Depending on how the compiler arranges other local variables and which way your stack grows, i may come right after array. If you follow Daniel's suggestion and put the printf statement in, you may notice an interesting effect. On my platform, when i = 10, array[10] = 22 clobbers i and the next assignment is to array[23].
A segmentation violation occurs when user code tries to touch a page that it does not have access to. In this case, you'll get one if your stack is small enough that 9999 iterations run off the end of the stack.
If you had allocated array on the heap instead (by using malloc()), then you would typically get a SIGSEGV only when you run off the end of the last mapped page. Even a 10-byte allocation lives inside a whole page that is mapped for the process. Page sizes vary by platform. Note that some malloc debuggers can try to flag an array out-of-bounds case, but you won't get the SIGSEGV unless the hardware gets involved when you run off the end of the page.
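To get a feel for the page arithmetic, here is a rough sketch, assuming a POSIX system where sysconf(_SC_PAGESIZE) is available. The helper names system_page_size and bytes_to_page_end are made up for illustration. Note that crossing a page boundary only faults if the next page happens to be unmapped, which is not guaranteed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Page size reported by the OS, or 0 if it cannot be determined. */
size_t system_page_size(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    return ps > 0 ? (size_t)ps : 0;
}

/* Hypothetical helper: number of bytes from `addr` up to the next
   page boundary, for a given page size. */
size_t bytes_to_page_end(uintptr_t addr, size_t page_size)
{
    assert(page_size != 0);
    return page_size - (size_t)(addr % page_size);
}
```

Calling bytes_to_page_end((uintptr_t)ptr, system_page_size()) on a malloc'd pointer gives an upper bound on how far an overrun can go before it could possibly leave the current page.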

Where your code will segfault depends on what compiler you're using, luck, and other linking details of the program in question. You will most likely not segfault for i == 10. Even though that is outside your array, you will almost certainly still have memory allocated to your process at that location. As you keep going beyond your array bounds, however, you will eventually leave the memory allocated to your process and then take a segfault.
However, if you write beyond an array boundary, you will likely overwrite other automatic variables in your same stack frame. If any of these variables are pointers (or array indexes used later), then when you reference these now-corrupted values, you'll possibly take a segfault. (Depending on the exact value corrupted to and whether or not you're now going to reference memory that is not allocated to your process.)
This is not terribly deterministic.

A segmentation fault happens when a process accesses memory outside of what has been allocated to it, and this is not easily predicted. When i == 10 the access is outside the array, but it might still be inside the memory of the process. This depends on how the process's memory was allocated, which there is (normally) no way of knowing, since it is up to the memory manager of the OS. So the segfault might happen at any i from 10 to 9998, or not at all.

I suggest using GDB to investigate such problems: http://www.gnu.org/software/gdb/documentation/

More generally, you can figure out where the segmentation fault occurs in Linux systems by using a debugger. For example, to use gdb, compile your program with debugging symbols by using the -g flag, e.g.:
gcc -g segfault.c -o segfault
Then, invoke gdb with your program and any arguments using the --args flag, e.g.:
gdb --args ./segfault myarg1 myarg2 ...
Then, when the debugger starts up, type run, and your program should run until it receives SIGSEGV, and should tell you where it was in the source code when it received that signal.

Related

Opposite of a Heisenbug, segmentation fault that appear ONLY when debugging, but program works fine

I've written a simple C program with two data structures implemented as ADTs, so I dynamically allocate the memory for them.
Everything was working fine until I decided to add an int value inside a struct: nothing dynamically allocated, just a plain old int member. But since I added it, I've been getting a segfault in a pretty safe function that shouldn't segfault at all.
I suspected a memory allocation error, so instead of freeing and reusing a pointer variable I was using, I tried using another variable, and with that change the program ran fine.
Pissed off by all the times I've had to deal with this kind of error, I re-enabled the free I mentioned before, recompiled and ran the program under valgrind.
To my surprise, there was absolutely no memory leak, no segmentation fault, no interruption of any kind, just a warning about Conditional jump or move depends on uninitialised value(s), but that's intended behavior (if (pointer == NULL) { }). So I ran the executable directly from the command line and again everything went fine, so the situation is this:
Program without the new int value in the struct:
Compile : check
Runs : check
Valgrind analysis: No memory leaks, just the warning
Debug (gdb) : check
Program with the new int value in the struct:
Compile : check
Runs : check
Valgrind analysis: No memory leaks, just the warning
Debug (gdb) : Segfault
So I think that's the opposite of a Heisenbug: a bug that shows itself only, and absolutely only, when debugging. How can I try to fix this?
OK, thanks to #weather-vane and #some-programmer-dude I've noticed that I indeed wasn't initializing the variable valgrind was complaining about. I had misunderstood the valgrind warning; I was reading it as "You should not use an if to check whether variables are NULL".
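For illustration, this is roughly what the fix looks like for a malloc'd ADT. The struct and function names below are hypothetical stand-ins, since the asker's code is not shown. malloc does not zero memory, so a pointer member that is compared against NULL later has to be set explicitly. Adding a member changes the size and layout of the allocation, which is one plausible reason the garbage value changed between builds.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical ADT, standing in for the asker's structures. */
struct stack_node {
    int value;
    struct stack_node *next;
};

struct stack {
    struct stack_node *top;
    int count;              /* the kind of int member added later */
};

struct stack *stack_new(void)
{
    struct stack *s = malloc(sizeof *s);
    if (s == NULL) {
        return NULL;
    }
    s->top = NULL;  /* without this, `s->top == NULL` tests garbage */
    s->count = 0;
    return s;
}

int stack_is_empty(const struct stack *s)
{
    assert(s != NULL);
    return s->top == NULL;
}

void stack_free(struct stack *s)
{
    if (s == NULL) {
        return;
    }
    while (s->top != NULL) {        /* release any remaining nodes */
        struct stack_node *next = s->top->next;
        free(s->top);
        s->top = next;
    }
    free(s);
}
```

Using calloc instead of malloc would also zero the block, though setting each member explicitly makes the intent clearer.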

Segmentation fault with ulimit set correctly

I tried to help an OP on this question.
I found out that code like the one below randomly causes a segmentation fault even if the stack limit is set to 2000 Kbytes.
#include <stdio.h>

int main(void)
{
    int a[510000];              /* 510000 * sizeof(int) bytes on the stack */
    a[509999] = 1;
    printf("%d", a[509999]);
    return 0;
}
As you can see the array is 510000 x 4 bytes = 2040000 bytes.
The stack is set to 2000 Kbytes (2048000 bytes) using ulimit command:
ulimit -s 2000
ulimit -Ss 2000
Based on those numbers the application has room to store the array, but randomly it returns a segmentation fault.
Any ideas?
There's a few reasons why you can't do this. There are things that are already using parts of your stack.
main is not the first thing on your stack. There are functions called by the real entry point, dynamic linker, etc. that are before main and they are all probably using some of the stack.
Additionally, there can be things that are usually put on the top of the stack to set up execution. Many systems I know put all the strings in argv and all environment variables on top of the stack (which is why main is not the entry point, there's usually code that runs before main that sets up environment variables and argv for main).
And to top it off a part of the stack can be deliberately wasted to increase the randomness of ASLR if your system does that.
Run your program in the debugger, add a breakpoint at main, look up the value of the stack register and examine the memory above it (remember that most likely your stack grows down unless you're on a weird architecture). I bet you'll find lots of pointers and strings there. I just did this on a Linux system and as I suspected all my environment variables were there.
The purpose of resource limits (ulimit) on Unix has never really been to micromanage things down to a byte/microsecond, they are there just to stop your program from going completely crazy and taking down the whole system with it. See them not as red lights and stop signs on a proper road, see them as run-off areas and crash barriers on a racetrack.
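The arithmetic behind this can be sketched in C. The limit can be read with getrlimit(RLIMIT_STACK), and a large local array only fits if some headroom is left for argv, the environment and the startup frames. The function names and the 16384-byte headroom used in the example are assumptions for illustration, not measured values.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Soft stack limit in bytes, or 0 if it is unknown or unlimited. */
size_t stack_limit_bytes(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY) {
        return 0;
    }
    return (size_t)rl.rlim_cur;
}

/* Hypothetical check: does an object of `bytes` fit under `limit`
   once `headroom` bytes are set aside for everything else on the stack? */
int fits_with_headroom(size_t bytes, size_t headroom, size_t limit)
{
    assert(limit != 0);
    if (headroom >= limit) {
        return 0;
    }
    return bytes <= limit - headroom;
}
```

With the numbers from the question, 2048000 - 2040000 leaves only 8000 bytes of slack, so fits_with_headroom(2040000, 16384, 2048000) returns 0 even though the array alone is smaller than the limit.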
If you still want to access that int location in the array, try compiling the code without main, so that the usual _start startup code is not invoked.

Can you access other program's memory using program compiled with mingw?

I wrote this very simple program on Windows 8.1 and compiled it using gcc from Mingw. I ran it with "test.exe > t.txt" and "test.exe > t1.txt" and the outputs were different (even though it uses virtual addresses). It ran for a while and then it crashed. I decided to test this because I'm reading a book on operating systems.
Is it reading other programs' memory? Wasn't that not supposed to happen? I'm probably misunderstanding something...
#include <stdio.h>

int main(int argc, char *argv[]){
    int r = 0;
    int p[4] = {1,5,4,3};
    for(r=0; p[r]!=1111111111111111; r++){
        p[2] = p[r];
        printf("%d\n", p[2]);
    }
    return 0;
}
Thank you.
SadSeven, I assume you are reading past the end of the array on purpose. What you are seeing is not other programs' memory; it's uninitialized memory inside your own program's memory.
Every program runs inside its own virtual memory space; the OS's virtual memory manager takes care of this. You can't access another program's memory from your program (unless you are both using shared memory, but you have to do that on purpose).
You haven't initialized anything beyond p[3]. The C language makes no guarantees about what will happen when you try to access addresses that haven't been initialized with data. You'll likely see a bunch of garbage, but what the garbage is isn't defined by the program you wrote. It could be anything.
The addresses you are accessing before the crash still belong to the current process; it is just uninitialized memory that exists between the stack and heap.
The process probably crashed due to a segmentation fault, which occurs when a process tries to access memory that doesn't belong to it. This would be the point when it attempts to access outside its own memory.
The output you see is from reading its own memory. When it reaches memory that isn't assigned to the process it crashes.
Edit:
To make things harder for computer viruses, the starting address of a program can be different each time you run it, so you should expect different output if you run it several times. On Windows, the address space layout is not randomized for all programs.
Your program overruns a local (auto) variable, which means that it will walk up through the stack frame(s). The stack frame contains local variables, function arguments, saved registers, the return address of the function call, and a pointer to the end of the previous stack frame. If the variables all have the same values any difference would be explained by memory addresses being different. There may be other reasons that I'm not aware of, as I'm not an expert on the memory layout in Windows.
In your code, the for loop is erroneous.
for(r=0; p[r]!=1111111111111111; r++)
it tries to access out of bound memory starting from a value of 4 of r. The result is undefined behaviour.
It may run fine till r is 1997, it may crash at r value 4, it may start playing a song from your playlist at r value 2015, even. The behaviour is not guaranteed after r value 3.
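For comparison, a bounded version of the loop might look like the sketch below; find_index is a hypothetical helper, not something from the question. Note also that, assuming a 32-bit int as on MinGW, the constant 1111111111111111 does not fit in an int, so no element of p can ever compare equal to it and the original loop has no terminating condition at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: index of the first element equal to `target`,
   or -1 if none of the `n` elements matches. Never reads past p[n-1]. */
long find_index(const int *p, size_t n, int target)
{
    assert(p != NULL);
    for (size_t r = 0; r < n; r++) {
        if (p[r] == target) {
            return (long)r;
        }
    }
    return -1;
}
```

For int p[4] = {1,5,4,3}, find_index(p, 4, 4) returns 2, and a value that is not present yields -1 instead of a walk up the stack.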
Each process runs within its own separate virtual address space (4GB on 32-bit Windows), so attempting an out-of-bounds read won't read from another process's memory. It will read garbage from its own address space. As for why the output is different: ASLR randomises key parts of an executable's layout, giving different base and stack addresses for each instance of a process, so even the same program run more than once will see different addresses.
Read about ASLR at: http://en.wikipedia.org/wiki/Address_space_layout_randomization
Read about Virtual Memory at:
http://en.wikipedia.org/wiki/Virtual_memory

Finding when data at a particular memory location was last modified? (C, gcc, gdb, valgrind, clang?)

I have a nasty segfault that's been plaguing me for a while. It had something to do with migration of code from 32-bit to 64-bit, but it's an occasional fault and hard to track down.
I wanted to know -- is there any tool (pref. Linux, FOSS) that I can use to trace back from a segfault, to find where in my code an illegal (out-of-bounds) pointer was assigned its value?
For example, if I get a segfault by trying to read the int value pointed to by my variable int *a (the value of which has been assigned somewhere else somewhere far away in my code), how can I find where in the code that value was assigned?
It seems like the sort of thing one might be able to do with clang/llvm, but I don't really know where to look. I guess such a thing can't really be done with gdb or valgrind, because AFAIK they don't have a way to store the required information during program execution.
Any suggestions anyone has got would be much appreciated!
Edit: after much digging, I found the error I had been looking for. Basically, a 'unsigned long *' was being cast to 'int *' in such a way that warnings were suppressed, somehow (http://ascend4.org/b564). The question still stands, though, because my bug-search was very manual and tedious: if I have a variable in my program, how I can trace back to find what sequence/chain/tree of statements caused it to take its current value? Is there any tool that automates this? This includes passing of parameters to functions, assignment statements (including assignment via dereferenced pointer), etc.
A memory breakpoint (a watchpoint in the GDB docs) sounds like the way to go. Compile using -g for debugging symbols and then place a memory-write breakpoint like this, where 0xdeadbeef stands for the address that print &a reports:
print &a
watch *0xdeadbeef
If you want to include the reads too, you can use awatch. Check the GDB docs for more information.
That way you should be able to trace the last write before the segmentation fault occurs.
The tool you should be using for problems like this is valgrind. For example, try the following code with Valgrind:
char *str = malloc(10);
str[10] = '\0';
It prints:
==14272== Invalid write of size 1
==14272== at 0x80483E4: main (in /path/to/a.out)
==14272== Address 0x4025032 is 0 bytes after a block of size 10 alloc'd
==14272== at 0x4005BDC: malloc (vg_replace_malloc.c:195)
==14272== by 0x80483D8: main (in /path/to/a.out)
However, if valgrind is not working for you, an option would be to replace malloc with mmap in a way that leaves either the bytes before the beginning of the allocated block or the bytes after the end of the allocated block unmapped. Because block sizes in general are not multiples of page sizes, you can pick only one of the options, not both. But you can run your program with the "leave beginning unmapped" and "leave end unmapped" strategies separately to catch both kinds of errors.
The code for how to replace malloc with mmap is too long for this answer, unfortunately.
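A full malloc replacement is indeed long, but the core of the "leave end unmapped" strategy can be sketched briefly. This is only an illustration, assuming a POSIX system with mmap, mprotect and MAP_ANONYMOUS. guarded_alloc and guarded_free are hypothetical names, and instead of leaving the trailing page unmapped the sketch maps it and then makes it inaccessible with PROT_NONE, which faults in the same way.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: return `size` bytes placed so that the first byte
   past the block lies on an inaccessible guard page. */
void *guarded_alloc(size_t size)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0 || size == 0 || size > SIZE_MAX / 2) {
        return NULL;
    }
    size_t page = (size_t)ps;
    size_t data_pages = (size + page - 1) / page;
    size_t total = (data_pages + 1) * page;      /* data + one guard page */

    unsigned char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return NULL;
    }
    unsigned char *guard = base + data_pages * page;
    if (mprotect(guard, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }
    return guard - size;    /* block ends exactly where the guard begins */
}

/* Release a block from guarded_alloc; `size` must match the request. */
void guarded_free(void *ptr, size_t size)
{
    assert(ptr != NULL && size != 0);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t data_pages = (size + page - 1) / page;
    unsigned char *guard = (unsigned char *)ptr + size;
    munmap(guard - data_pages * page, (data_pages + 1) * page);
}
```

With char *str = guarded_alloc(10), a write to str[10] should hit the guard page and raise SIGSEGV at the faulting instruction. One caveat of this placement is that the returned pointer is only as aligned as the size allows, so it is not a drop-in replacement for malloc for every type.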

Strcpy a bigger string to a smaller array of char

Why when I do this:
char teststrcpy[5];
strcpy(teststrcpy,"thisisahugestring");
I get this message in run time:
Abort trap: 6
Shouldn't it just overwrite what is to the right of teststrcpy in memory? If not, what does Abort trap mean?
I'm using the GCC compiler under Mac OS X
As a note, and in answer to some comments, I am doing this for playing around C, I'm not going to try to do this in production. Don't you worry folkz! :)
Thanks
I don't own one, but I've read that Mac OS treats overflows differently: it won't allow you to overwrite memory in certain instances, strcpy() being one of them.
On a Linux machine, this code successfully overwrites the adjacent stack memory, but on Mac OS it is prevented (Abort trap) by a stack canary.
You might be able to get around that with the gcc option -fno-stack-protector
Ok, since you're seeing an abort from __strcpy_chk that would mean it's specifically checking strcpy (and probably friends). So in theory you could do the following*:
char teststrcpy[5];
gets(teststrcpy);
Then enter your really long string and it should behave badly, as you wish.
*I am only advising gets in this specific instance in an attempt to get around the OS's protection mechanisms that are in place. Under NO other instances would I suggest anyone use the code. gets is not safe.
Shouldn't it just overwrite what is in the right of the memory of teststrcpy?
Not necessarily, it's undefined behaviour to write outside the allocated memory. In your case, something detected the out-of-bounds write and aborted the programme.
In C there is nobody who tells you that the "buffer is too small"; if you insist on copying too many characters into a buffer that is too small, you enter undefined-behavior territory.
If you would LIKE to overwrite what's after the 5th char of teststrcpy, you are a scary man. You can copy a string of length 4 into your teststrcpy (the fifth char SHOULD be reserved for the terminating NUL).
Most likely your compiler is using a canary for buffer overflow protection and, thus, raising this exception when there is an overflow, preventing you from writing outside the buffer.
See http://en.wikipedia.org/wiki/Buffer_overflow_protection#Canaries
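For the non-experimental case, a copy that respects the destination size could look like this sketch. copy_bounded is a hypothetical helper, not a standard function. It truncates rather than overflowing and reports whether the whole string fit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `src` into `dst` (capacity `dst_size` bytes,
   including the terminating NUL), truncating if needed.
   Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t len = strlen(src);
    size_t n = len < dst_size ? len : dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';          /* always leave room for the terminator */
    return len < dst_size;
}
```

With char teststrcpy[5], copy_bounded(teststrcpy, sizeof teststrcpy, "thisisahugestring") stores "this" and returns 0, so the caller knows the input was too long.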

Resources