This question already has answers here:
Accessing an array out of bounds gives no error, why?
(18 answers)
Closed 9 years ago.
I have a program which I expect it to crash but it doesn't. Can you please let me know the reason.
char a[5];
strncpy(a,"abcdefg",7);
a[7] = '\0';
printf("%s\n",a);
Shouldn't the program crash at strncpy() or at a[7]='\0' which is greater than array size of 5. I get output as abcedefg. I'm using gcc compiler.
Size of a array is five char a[5]; and your are assigning at 7th location that is buffer overrun problem and behavior of your code is Undefined at run time.
strncpy(a,"abcdefg",7);
a[7] = '\0';
Both are wrong, you need to defined array like:
#defined size 9 // greater then > 7
char a[size];
notice "abcdefg" need 8 char one extra for \0 null char.
read: a string ends with a null character, literally a '\0' character
In your example, your program has access to memory beyond a (starting address of array) plus 5 as the stack of the program may be higher. Hence, though the code works, ideally it is undefined behavior.
C often assumes you know what your doing, even (especially) when you've done something wrong. There is no bounds to an array, and you'll only get an error if your lucky and you've entered into an undefined memory location and get a segmentation fault. Otherwise you'll be able to access change memory, to whatever results.
You can't give a definition to undefined behaviour, as you are attempting by stating that it should crash. Another example of undefined behaviour that doesn't commonly crash is int x = INT_MAX + 1;, and int x = 0; x = x++ + ++x;. These might work on your system, if only by coincidence. That doesn't stop them from wreaking havoc on other systems!
Consider "Colourless, green ideas sleep furiously", or "The typewriter passed the elephant to the blackness". Do either of these statements make any sense in English? How would you interpret them? This is a similar situation to how C implementations might treat undefined behaviour.
Let us consider what might happen if you ask me to put 42 eggs in my carton that can store at least 12 eggs. The container most certainly has bounds, but you insist that they can all fit in there. I find that the container can only store 12 eggs. You won't know what happens to the 30 remaining eggs, so the behaviour is undefined.
Related
int main ()
{
/*
char a[] = "abc";
printf("strlen(a): %li", strlen(a));
printf("\nsizeof(a): %li", sizeof(a));
*/
char b[3];
printf("\nstrlen(b): %li", strlen(b));
printf("\nsizeof(b): %li", sizeof(b));
printf("\nb = ");
puts(b);
return 0;
}
When I run the above code it outputs the following:
strlen(b): 1
sizeof(b): 3
b =
but if I undo the comment, it outputs:
strlen(a): 3
sizeof(a): 4
strlen(b): 6
sizeof(b): 3
b = ���abc
Why does this happens? I would appreciate a good in depth explanation about it principally and if possible a quick "fix" for it so I don't get this problem again.
I'm relatively a beginner in programming and C in general and based on what I learned until now, this shouldn't happen
thanks and sorry if I broke any rule from this website, I'm new here too!
strlen(b) causes undefined behavior because the array b is not initialized. The contents of the array are therefore indeterminate. strlen may return a small number if there happens to be a null byte in the garbage contents of the array (acting as a null terminator), or a large number if there is no null byte in the array but there is one in memory adjacent to it (that happens not to crash when accessed), or it may segfault, or fail in some other unpredictable way. The particular misbehavior you observe can easily depend on the contents of other nearby memory and therefore be influenced by adding or removing other variables, or altering surrounding code in apparently unrelated ways.
puts(b) is similarly undefined behavior.
(Another bug: sizeof and strlen both return size_t, for which the correct printf format specifier is %zu, not %li which would be for long int.)
I would appreciate a good in depth explanation about it principally and if possible a quick "fix" for it so I don't get this problem again.
Do not attempt to read or use the contents of local variables that have not been initialized.
See also What happens to a declared, uninitialized variable in C? Does it have a value? and (Why) is using an uninitialized variable undefined behavior?.
If you enable compiler warnings, your compiler can warn you about some instances of this, e.g. gcc catches this example. Tools like valgrind can help too.
I'm relatively a beginner in programming and C in general and based on what I learned until now, this shouldn't happen
On the contrary, such behavior is extremely common in C. The C language does not guarantee any checks for bugs like this, and implementations generally don't provide them. You should get used to the possibility that the language will not stop you from doing something erroneous, and will instead misbehave in unpredictable ways (or worse, appear to work just fine for a while). As a result, when programming in C, you have to be much more careful and attentive to the language rules than when working with "safer" languages. It's a tough and unfriendly language for beginners.
I have to show an error when I access an item outside of an array (without creating my own function for it). So I just thought it was necessary to access the value out of the array to trigger a segfault but this code does not crash at all:
int main(){
int tab[4];
printf("%d", tab[7]);
}
Why I can't get an error when I'm doing this?
When you invoke undefined behavior, anything can happen. You program may crash, it may display strange results, or it may appear to work properly.
Also, making a seemingly unrelated change such as adding an unused local variable or a simple call to printf can change the way in which undefined behavior manifests itself.
When I ran this program, it completed and printed 63. If I changed the referenced index from 7 to 7000, I get a segfault.
In short, just because the program can crash doesn't mean it will.
Because the behavior when you do things not allowed by the spec is "undefined". And because there are no bounds checks required in C. You got "lucky".
int tab[4]; says to allocate memory for 4 integers on the stack. tab is just a number of a memory address. It doesn't know anything about what it's pointing at or how much space as been allocated.
printf("%d", tab[7]); says to print out the 8th element of tab. So the compiler does...
tab is set to 1000 (for example) meaning memory address 1000.
tab represents a list of int, so each element will be sizeof(int), probably 4 or 8 bytes. Let's say 8.
Therefore tab[7] means to start reading at memory position (7 * 8) + 1000 = 1056 and for 8 more bytes. So it reads 1056 to 1063.
That's it. No bounds checks by the program itself. The hardware or OS might do a bounds check to prevent one process from reading arbitrary memory, have a look into protected memory, but nothing required by C.
So tab[7] faithfully reproduces whatever garbage is in 1056 to 1063.
You can write a little program to see this.
int main(){
int tab[4];
printf("sizeof(int): %zu\n", sizeof(int));
printf("tab: %d\n", tab);
printf("&tab[7]: %d\n", &tab[7]);
/* Note: tab must be cast to an integer else C will do pointer
math on it. `7 + tab` gives the same result. */
printf("(7 * sizeof(int)) + (int)tab: %d\n", (7 * sizeof(int)) + (int)tab);
printf("7 + tab: %d\n", 7 + tab);
}
The exact results will vary, but you'll see that &tab[7] is just some math done on tab to figure out what memory address to examine.
$ ./test
sizeof(int): 4
tab: 1595446448
&tab[7]: 1595446476
(7 * sizeof(int)) + (int)tab: 1595446476
7 + tab: 1595446476
1595446476 - 1595446448 is 28. 7 * 4 is 28.
An array in C is just a pointer to a block of memory with a starting point at, in this case, the arbitrary location of tab[0]. Sure you've set a bound of 4 but if you go past that, you just accessing random values that are past that block of memory. (i.e. the reason it is probably printing out weird numbers).
That is my code:
#include<stdio.h>
int main()
{
int vet[10], i;
for(i=30; i<=45; i++)
{
scanf("%d", &vet[i]);
}
for(i=30; i<=45; i++)
printf(" %d ", vet[i]);
for(i=30; i<=45; i++)
printf(" %x", &vet[i]);
return 0;
}
I declared just 10 positions of int type on memory, but i get more, so what happened ?
it is a memory overflow ?
and the type %x is correctly to print the memory adress ?
the imput was:
1
2
3
4
5
6
7
8
9
10 /*It was to be stoped right here !?*/
11
12
13
14
15
16
and returned:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 /*I put space to indent*/
22ff6c 22ff70 22ff74 22ff78 22ff7c 22ff80 22ff84 22ff88 22ff8c 22ff90 22ff94 22ff98 22ff9c 22ffa0 22ffa4 22ffa8
The C language does not check bounds when you access arrays for reading or writing. It is up to the program author to ensure that the program accesses only valid array elements.
In this case, you wrote values to memory addresses outside your declared array. While you may sometimes get a segmentation violation (SIGSEGV) in this case, you may just get "lucky" -- really, unlucky -- and not encounter any problems at runtime.
C doesn't enforce array boundaries. Keeping within the limits is your responsibility in that language - it will let you do plainly wrong things, but it may crash at runtime.
Not only does the C language not check bounds on array accesses with respect to array size, which explains why you are successfully writing to the array 15 times, but C also does not have a mechanism for converting your range of 30 to 45 into the range of the first 10 (or 15?) elements of the array.
So, you are really attempting to write to the 31st through 46th element of the array vet, which has only 10 elements.
C is perfectly happy to let you read from and write to an array past the bounds you set (10, in this case).
Reading past the limit just gives you garbage; writing past it will do all kinds of crazy things and generally crash your program (or, if you are unlucky, overwrite your entire hard drive).
You were lucky with this program, but you should not keep doing that. In C, you are responsible for enforcing the limits of your arrays yourself.
int vet[10] declares a block of ten integers in memory. These memory locations are accessed via vet[0] through vet[9]. Any other access to memory through vet is undefined behavior. Absolutely anything could be within that memory, and you can easily corrupt the rest of your program execution. The compiler trusts you to know better than what you were doing.
As #NigelHarper correctly points out, %p is the official way of printing pointers. It prints in hexadecimal. Pointers could print in decimal, but the number itself is meaningless. Hexadecimal makes the printing more concise, and just as easy to see differences from one address to the next.
It is also possible to use %x for printing a pointer, since all that does is take a value and print it in hexadecimal form.
C does not do bounds checking on arrays and you are accessing an array out of bounds. The possible valid indexes in the array are [0,9], but you are accessing [30,45].
You should modify your code to only access valid indexes:
int SIZE = 10;
int vet[SIZE];
//...
// not for( i = 30; i <= 45; i++ )
for( i = 0; i < SIZE; ++i ) { /* ... */ }
C Language doesn't have support to check the out of bound array accesses. IN c++, if you try to access out of bound array memory location, it will generate Segmentation Fault which causes your process to terminate. As, C doesn't allow it, it is expected behavior.
I'm going through the K & R book and the answer to one of the exercises is troubling me.
In the solutions manual, exercise 1-22 declares a char array:
#define MAXCOL 10
char line[MAXCOL];
so my understanding is that in C arrays go from 0 ... n-1. If that's the case then the above declaration should allocate memory for a char array of length 10 starting with 0 and ending with 9. More to the point line[10] is out of bounds according to my understanding? A function in the sample program is eventually passed a integer value pos that is equal to 10 and the following comparison takes place:
int findblnk(int pos) {
while(pos > 0 && line[pos] != ' ')
--pos;
if (pos == 0) //no blanks in line ?
return MAXCOL;
else //at least one blank
return pos+1; //position after blank
}
If pos is 10 and line[] is only of length 10, then isn't line[pos] out of bounds for the array?
Is it okay to make comparisons this way in C, or could this potentially lead to a segmentation fault? I am sure the solutions manual is right this just really confused me. Also I can post the entire program if necessary. Thanks!
Thanks for the speedy and very helpful responses, I guess it is definitely a bug then. It is called through the following branch:
else if (++pos >= MAXCOL) {
pos = findblnk(pos);
printl(pos);
pos = newpos(pos);
}
MAXCOL is defined as 10 as stated above. So for this branch findblnk(pos) pos would be passed 10 as a minimum.
Do you think the solution manual for K & R is worth going through or is it known for having buggy code examples?
It is never, ever okay to over-run the bounds of an array in C. (Or any language really).
If 10 is really passed to that function, that is certainly a bug. While there are better ways of doing it, that function should at least verify that pos is within the bounds of line before attempting to use it as an index.
If pos is indeed 10 then it would be an out of bounds access and accessing an array out of bounds is undefined behavior and therefore anything can happen even a program that appears to work properly at the time, the results are unreliable. The draft C99 standard Annex J.2 undefined behavior contains the follows bullet:
An array subscript is out of range, even if an object is apparently accessible with the
given subscript (as in the lvalue expression a[1][7] given the declaration int
a[4][5]) (6.5.6).
I don't have a copy of K&R handy but the errata does not list anything for this problem. My best guess is the condition should < instead of >=.
Code above is fine as long as pos == 9 when its passed to that function . If pos ==10 when its passed then its undefined behaviour and .. you are correct , it should be avoided.
However it may or may not give segmentation fault .
my_type buffer[SOME_CONSTANT_NAME]; almost always is a bug.
Code like the one you present in the question is the source of the majority of security problems: when the buffer overflows, it invokes undefined behaviour, and that undefined behaviour (if it does not directly crash the program) can frequently be exploited by attackers to execute their own code within your process.
So, my advice is to stay away from all fixed buffer sizes and either use C++'s std::vector<> or dynamically allocate enough memory to fit. The Posix 2008 standard makes this quite easy even in C with the asprintf() function and friends.
I came across this code accidentally:
#include<stdio.h>
int main()
{
int i;
int array[3];
for(i=0;i<=3;i++)
array[i]=0;
return 0;
}
On running this code my terminal gets hanged - the code is not terminating.
When I replace 3 by 2 code runs successfully and terminates without a problem.
In C there is no bound checking on arrays, so what's the problem with the above code that is causing it to not terminate?
Platform - Ubuntu 10.04
Compiler - gcc
Just because there's no bound checking doesn't mean that there are no consequences to writing out of bounds. Doing so invokes Undefined Behavior, so there's no telling what may happen.
This time, on this compiler, on this architecture, it happens that when you write to array[3], you actually set i to zero, because i was positioned right after array on the stack.
Your code is reading beyond the bound of array and causing an Undefined Behavior.
When you declare an array of size 3. The valid index range is from 0 to 2.
While your loop runs from 0 to 3.
If you access anything beyond the valid range of an array then it is Undefined Behavior and your program may hang or crash or show any behavior. The c standard does not mandate any specific behavior in such cases.
When you say C does not do bounds checking it actually means that it is programmers responsibility to ensure that their programs do not access beyond the beyonds of the allocated array and failing to do so results in all safe bets being off and any behavior.
int array[3];
This declares an array of 3 ints, having indices 0, 1, and 2.
for(i=0;i<=3;i++)
array[i]=0;
This writes four ints into the array, at indices 0, 1, 2, and 3. That's a problem.
Nobody here can tell exactly what you're seeing -- you haven't even specified what platform you're working on. All we can say is that the code is broken, and that leads to whatever result you're seeing. One possibility is that i is stored right after array, so you end up setting i back to 0 when you do array[3]=0;. But that's just a guess.
The highest valid index for array is 2. Writing past that index invokes undefined behaviour.
What you're seeing is a manifestation of the undefined behaviour.
Contrast this with the following two snippets, both of which are correct:
/* 1 */
int array[3];
for(i=0;i<3;i++) { array[i] = 0; }
/* 2 */
int array[4];
for(i=0;i<4;i++) { array[i] = 0; }
You declared array of size 3 which means (0,1,2 are the valid indexes)
if you try to set 0 to some memory location which is not for us unexpected (generally called UB undefined behavior) things can happen
The elements in an array are numbered 0 to (n-1). Your array has 3 spots, but is initializing 4 location (0, 1, 2, 3). Typically, you'd have you for loop say i < 3 so that your numbers match, but you don't go over the upper bound of the array.