I get the outcome for squares
squares = [ 512, 1, 4, 9, 16, 25, 36, 49 ].
I know I reached the boundaries of my limit but where did 512 come from? Can you give me an explanation of all the individual steps involved in the error occurring?
int main()
{
unsigned squares[8];
unsigned cubes[8];
for (int i = 0; i <= 8; i++) {
squares[i] = i * i;
cubes[i] = i * i * i;
}
}
I would say the same thing as everyone is saying. It is undefined behavior and you should not do that.
Now, you would ask whether the undefined behavior is the reason for whatever is happening.
I would say, Yes, probably.
Now, you may think this is an easy escape from answering the question.
It may be, but..
The main problem with these kind of question is, it is very hard to re-create the same case and investigate what actually made it behave the way it did. Because undefined behavior, well, behaves in a very undefined way.
That is the reason people do not try answering these kind of questions, and people advise to stay away from undefined behavior territory.
I know i reached the boundaries of my limit
Then you should know the consequences, too. Accessing out of bound memory invokes undefined behavior.
Stay within the valid memory limit. Use
for (int i = 0; i <8; i++)
You're accessing memory beyond the limit
for (int i = 0; i <= 8; i++)
should be
for (int i = 0; i <8; i++)
Remember unsigned squares[8]; allows you to legally access squares[0] upto squares[7]
I know i reached the boundaries of my limit but where did 512 come
from.
The consequences of illegal memory access is undefined as per ISO/IEC 9899:201x 6.5.10->fn109
Two objects may be adjacent in memory because they are adjacent
elements of a larger array or adjacent members of a structure with no
padding between them, or because the implementation chose to place
them so, even though they are unrelated. If prior invalid pointer
operations (such as accesses outside array bounds) produced undefined
behavior, subsequent comparisons also produce undefined behavior
You may use a debugger(say gdb) or an instrumentation framework(say valgrind) to find where the value came from. Here 512 looks like the cube of 8 but there is no guarantee that you will get the same value on the next run. Morever, there is a chance that the program might crash.
Related
In C99, is there a big different between these two?:
int main() {
int n , m;
scanf("%d %d", &n, &m);
int X[n][m];
X[n-1][m-1] = 5;
printf("%d", X[n-1][m-1]);
}
and:
int main(int argc, char *argv[]) {
int n , m;
int X[n][m];
scanf("%d %d", &n, &m);
X[n-1][m-1] = 5;
printf("%d", X[n-1][m-1]);
}
The first one seems to always work, whereas the second one appears to work for most inputs, but gives a segfault for the inputs 5 5 and 6 6 and returns a different value than 5 for the input 9 9. So do you need to make sure to get the values before declaring them with variable length arrays or is there something else going on here?
When the second one works, it's pure chance. The fact that it ever works proves that, thankfully, compilers can't yet make demons fly out of your nose.
Declaring a variable doesn't necessarily initialize it. int n, m; leaves both n and m with undefined values in this case, and attempting to access those values is undefined behavior. If the raw binary data in the memory those point to happen to be interpreted into a value larger than the values entered for n and m -- which is very, very far from guaranteed -- then your code will work; if not, it won't. Your compiler could also have made this segfault, or made it melt your CPU; it's undefined behavior, so anything can happen.
For example, let's say that the area of memory that the compiler dedicates to n happened to contain the number 10589231, and m got 14. If you then entered an n of 12 and an m of 6, you're golden -- the array happens to be big enough. On the other hand, if n got 4 and m got 2, then your code will look past the end of the array, and you'll get undefined behavior -- which might not even break, since it's entirely possible that the bits stored in four-byte segments after the end of the array are both accessible to your program and valid integers according to your compiler/the C standard. In addition, it's possible for n and m to end up with negative values, which leads to... weird stuff. Probably.
Of course, this is all fluff and speculation depending on the compiler, OS, time of day, and phase of the moon,1 and you can't rely on any numbers happening to be initialized to the right ones.
With the first one, on the other hand, you're assigning the values through scanf, so (assuming it doesn't error) (and the entered numbers aren't negative) (or zero) you're going to have valid indices, because the array is guaranteed to be big enough because the variables are initialized properly.
Just to be clear, even though variables are required to be zero-initialized under some circumstances doesn't mean you should rely on that behavior. You should always explicitly give variables a default value, or initialize them as soon as possible after their declaration (in the case of using something like scanf). This makes your code clearer, and prevents people from wondering if you're relying on this type of UB.
1: Source: Ryan Bemrose, in chat
int X[n][m]; means to declare an array whose dimensions are the values that n and m currently have. C code doesn't look into the future; statements and declarations are executed in the order they are encountered.
In your second code you did not give n or m values, so this is undefined behaviour which means that anything may happen.
Here is another example of sequential execution:
int x = 5;
printf("%d\n", x);
x = 7;
This will print 5, not 7.
The second one should produce bugs because n and m are initialized with pretty much random values if they're local variables. If they're global, they'll be with value 0.
int main(){
int i;
int arr[4];
for(int i=0; i<=4; i++){
arr[i] = 0;
}
return 0;
}
I watched a video on youtube of CS107(lecture 13) in which this example is shown and told why the above program will lead to infinite loop by showing memory diagrams. arr[4] goes out of bounds and should lead to an address where i is stored and changing the value of i back to 0 hence leading to infinite loop. But when I tried to run this on my mac using gcc compiler, for loop executed (checked by inserting printf) 5 times. i.e for the value of i = 0,1,2,3,4.
When there is undefined behavior you can't expect a single or particular behavior. Anything could happen.
What said in that video is just one of many possibilities of undefined behavior and what you are getting is another possibility. One should not rely upon any particular behavior.
first: it is undefined behaviour, anything can happen.
said this, you should try different optimization levels. one optimization can lead to loop-reduction:
for (i = 0; i <= 2; i++) arr[i] = 0;
can be reduced, because i and arr[i] are invariant in the scope of the loop, to
arr[0] = 0;
arr[1] = 0;
arr[2] = 0;
that would lead to your tested result.
Another thing to consider is the architecture which arranges the stack and places the items onto the stack (you cannot rely on the ordering of the items)
I'm creating the below array:
int p[100];
int
main ()
{
int i = 0;
while (1)
{
p[i] = 148;
i++;
}
return (0);
}
The program aborts with a segmentation fault after writing 1000 positions of the array, instead of the 100. I know that C doesn't check if the program writes out of bounds, this is left to the OS. I'm running it on ubuntu, the size of the stack is 8MB (limit -s). Why is it aborting after 1000? How can I check how much memory my OS allocates for an array?
Sorry if it's been asked before, I've been googling this but can't seem to find a specific explanation for this.
Accessing an invalid memory location leads to Undefined Behavior which means that anything can happen. It is not necessary for a segmentation-fault to occur.
...the size of the stack is 8MB (limit -s)...
variable int p[100]; is not at the stack but in data area because it is defined as global. It is not initialized so it is placed into BSS area and filled with zeros. You can check that printing array values just at the beginning of main() function.
As other said, using p[i] = 148; you produced undefined behaviour. Filling 1000 position you most probably reached end of BSS area and got segmentation fault.
It appear that you clearly get over the 100 elements defined (int p[100];) since you make a loop without any limitation (while (1)).
I would suggest to you to use a for loop instead:
for (i = 0; i < 100; i++) {
// do your stuff...
}
Regarding you more specific question about the memory, consider that any outside range request (in your situation over the 100 elements of the array) can produce an error. The fact that you notice it was 1000 in your situation can change depending on memory usage by other program.
It will fail once the CPU says
HEY! that's not Your memory, leave it!
The fact that the memory is not inside of the array does not mean that it's not for the application to manipulate.
The program aborts with a segmentation fault after writing 1000 positions of the array, instead of the 100.
You do not reason out Undefined Behavior. Its like asking If 1000 people are under a coconut tree, will 700 hundred of them always fall unconscious if a Coconut smacks each of their heads?
I'm going through the K & R book and the answer to one of the exercises is troubling me.
In the solutions manual, exercise 1-22 declares a char array:
#define MAXCOL 10
char line[MAXCOL];
so my understanding is that in C arrays go from 0 ... n-1. If that's the case then the above declaration should allocate memory for a char array of length 10 starting with 0 and ending with 9. More to the point line[10] is out of bounds according to my understanding? A function in the sample program is eventually passed a integer value pos that is equal to 10 and the following comparison takes place:
int findblnk(int pos) {
while(pos > 0 && line[pos] != ' ')
--pos;
if (pos == 0) //no blanks in line ?
return MAXCOL;
else //at least one blank
return pos+1; //position after blank
}
If pos is 10 and line[] is only of length 10, then isn't line[pos] out of bounds for the array?
Is it okay to make comparisons this way in C, or could this potentially lead to a segmentation fault? I am sure the solutions manual is right this just really confused me. Also I can post the entire program if necessary. Thanks!
Thanks for the speedy and very helpful responses, I guess it is definitely a bug then. It is called through the following branch:
else if (++pos >= MAXCOL) {
pos = findblnk(pos);
printl(pos);
pos = newpos(pos);
}
MAXCOL is defined as 10 as stated above. So for this branch findblnk(pos) pos would be passed 10 as a minimum.
Do you think the solution manual for K & R is worth going through or is it known for having buggy code examples?
It is never, ever okay to over-run the bounds of an array in C. (Or any language really).
If 10 is really passed to that function, that is certainly a bug. While there are better ways of doing it, that function should at least verify that pos is within the bounds of line before attempting to use it as an index.
If pos is indeed 10 then it would be an out of bounds access and accessing an array out of bounds is undefined behavior and therefore anything can happen even a program that appears to work properly at the time, the results are unreliable. The draft C99 standard Annex J.2 undefined behavior contains the follows bullet:
An array subscript is out of range, even if an object is apparently accessible with the
given subscript (as in the lvalue expression a[1][7] given the declaration int
a[4][5]) (6.5.6).
I don't have a copy of K&R handy but the errata does not list anything for this problem. My best guess is the condition should < instead of >=.
Code above is fine as long as pos == 9 when its passed to that function . If pos ==10 when its passed then its undefined behaviour and .. you are correct , it should be avoided.
However it may or may not give segmentation fault .
my_type buffer[SOME_CONSTANT_NAME]; almost always is a bug.
Code like the one you present in the question is the source of the majority of security problems: when the buffer overflows, it invokes undefined behaviour, and that undefined behaviour (if it does not directly crash the program) can frequently be exploited by attackers to execute their own code within your process.
So, my advice is to stay away from all fixed buffer sizes and either use C++'s std::vector<> or dynamically allocate enough memory to fit. The Posix 2008 standard makes this quite easy even in C with the asprintf() function and friends.
I came across this code accidentally:
#include<stdio.h>
int main()
{
int i;
int array[3];
for(i=0;i<=3;i++)
array[i]=0;
return 0;
}
On running this code my terminal gets hanged - the code is not terminating.
When I replace 3 by 2 code runs successfully and terminates without a problem.
In C there is no bound checking on arrays, so what's the problem with the above code that is causing it to not terminate?
Platform - Ubuntu 10.04
Compiler - gcc
Just because there's no bound checking doesn't mean that there are no consequences to writing out of bounds. Doing so invokes Undefined Behavior, so there's no telling what may happen.
This time, on this compiler, on this architecture, it happens that when you write to array[3], you actually set i to zero, because i was positioned right after array on the stack.
Your code is reading beyond the bound of array and causing an Undefined Behavior.
When you declare an array of size 3. The valid index range is from 0 to 2.
While your loop runs from 0 to 3.
If you access anything beyond the valid range of an array then it is Undefined Behavior and your program may hang or crash or show any behavior. The c standard does not mandate any specific behavior in such cases.
When you say C does not do bounds checking it actually means that it is programmers responsibility to ensure that their programs do not access beyond the beyonds of the allocated array and failing to do so results in all safe bets being off and any behavior.
int array[3];
This declares an array of 3 ints, having indices 0, 1, and 2.
for(i=0;i<=3;i++)
array[i]=0;
This writes four ints into the array, at indices 0, 1, 2, and 3. That's a problem.
Nobody here can tell exactly what you're seeing -- you haven't even specified what platform you're working on. All we can say is that the code is broken, and that leads to whatever result you're seeing. One possibility is that i is stored right after array, so you end up setting i back to 0 when you do array[3]=0;. But that's just a guess.
The highest valid index for array is 2. Writing past that index invokes undefined behaviour.
What you're seeing is a manifestation of the undefined behaviour.
Contrast this with the following two snippets, both of which are correct:
/* 1 */
int array[3];
for(i=0;i<3;i++) { array[i] = 0; }
/* 2 */
int array[4];
for(i=0;i<4;i++) { array[i] = 0; }
You declared array of size 3 which means (0,1,2 are the valid indexes)
if you try to set 0 to some memory location which is not for us unexpected (generally called UB undefined behavior) things can happen
The elements in an array are numbered 0 to (n-1). Your array has 3 spots, but is initializing 4 location (0, 1, 2, 3). Typically, you'd have you for loop say i < 3 so that your numbers match, but you don't go over the upper bound of the array.