Accessing a non-existing array element - C

#include <stdio.h>

int main(int argc, char const *argv[])
{
    int anArray[5];
    anArray[0] = 54;
    anArray[1] = 54;
    anArray[2] = 54;
    anArray[3] = 54;
    anArray[4] = 54;
    anArray[5] = 54;
    anArray[6] = 54;
    anArray[7] = 54;
    printf("%i\n", anArray[7]);
    return 0;
}
This prints 54.
How does this even work? We say that C arrays are not dynamic. Why does this even compile? And even if it compiles, shouldn't it throw a segfault?
I have defined an array of 5 elements, then accessed elements 5, 6, and 7. Why is it possible to assign a value to, for example, anArray[5]?
Please note that I have a C++ background and haven't used this kind of array for a long time.

You are scribbling into memory that you don't own, so anything could happen. You got lucky and the computer let you write and then read the value in that location. But it's just luck: the behavior is undefined.
Note that the exact same thing applies to C++ (since you mentioned it), not only with C-style arrays but also with std::vector::operator[] and std::array in C++11. In C++ you can use vec.at(idx) instead of vec[idx] to get bounds checking on every access.

The language itself doesn't say the runtime or the compiler has to check you're actually accessing elements inside the bounds of the array. The compiler could emit a warning, but that's it. You are responsible for accessing valid elements. Not doing so results in undefined behavior, which means anything can happen, including appearing to work.

You're basically reading and writing memory at places where you don't know what's there. This can be a useful thing in C (if you really know what you're doing), but it can also cost you hours of frustrating debugging, because what happens there is undefined behaviour.
From Wikipedia:
Many programming languages, such as C, never perform automatic bounds checking to raise speed. However, this leaves many off-by-one errors and buffer overflows uncaught. Many programmers believe these languages sacrifice too much for rapid execution.

No compiler error: the compiler is not required to diagnose out-of-bounds accesses, so nothing stops the code from compiling.
No run-time error: because of undefined behavior; you were lucky that the memory location you
were trying to access was writable at that time!

Related

Dynamic array without malloc?

I was reading through some source code and found a functionality that basically allows you to use an array as a linked list? The code works as follows:
#include <stdio.h>

int main(void)
{
    int *s;
    for (int i = 0; i < 10; i++)
    {
        s[i] = i;
    }
    for (int i = 0; i < 10; i++)
    {
        printf("%d\n", s[i]);
    }
    return 0;
}
I understand that s points to the beginning of an array in this case, but the size of the array was never defined. Why does this work and what are the limitations of it? Memory corruption, etc.
Why does this work
It does not, it appears to work (which is actually bad luck).
and what are the limitations of it? Memory corruption, etc.
Undefined behavior.
Keep in mind: In your program whatever memory location you try to use, it must be defined. Either you have to make use of compile-time allocation (scalar variable definitions, for example), or, for pointer types, you need to either make them point to some valid memory (address of a previously defined variable) or, allocate memory at run-time (using allocator functions). Using any arbitrary memory location, which is indeterminate, is invalid and will cause UB.
I understand that s points to the beginning of an array in this case
No, the pointer has automatic storage duration and was not initialized:
int *s;
So it has an indeterminate value and points nowhere in particular.
but the size of the array was never defined
There is no array declared or defined in the program at all.
Why does this work and what are the limitations of it?
It works by chance. That is, it produced the expected result when you ran it. But actually the program has undefined behavior.
As I have pointed out first on the comments, what you are doing does not work, it seems to work, but it is in fact undefined behaviour.
In computer programming, undefined behavior (UB) is the result of
executing a program whose behavior is prescribed to be unpredictable
in the language specification to which the computer code adheres.
Hence, it might "work" sometimes, and sometimes not. Consequently, one should never rely on such behaviour.
If it were that easy to allocate a dynamic array in C, why would anyone use malloc?! Try it with a value much bigger than 10 to increase the likelihood of a segmentation fault.
Look into the SO thread to see how to properly allocate an array in C.

Using malloc is giving me more memory than expected?

I'm trying to get to grips with malloc, and so far I'm mostly getting unexpected results when testing and playing around with it.
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv)
{
    int size = 10;
    int *A;
    A = (int *)malloc(size * sizeof(int));
    for (int i = 0; i < 10000; i++)
    {
        A[i] = i;
    }
    for (int i = 0; i < 10000; i++)
    {
        printf("%d) %d\n", i, A[i]);
    }
}
With the example code above, the program runs without an error, even though I only allocated A to hold 10 ints, so I expected the loop to run only 10 times before hitting an error. If I increase the loop to about 30-40k iterations, it does hit a segmentation fault. And if I increase size to match the loop count, it always works as expected. So I know how to avoid the error... kinda. I was just hoping someone might be kind enough to explain why this is.
Edit: Turned out I didn't appreciate that C doesn't detect out of bounds, I've been looked after way too much by Java and C++. I had undefined behavior and now know it's my job to prevent them. Thanks for everyone that replied.
C isn't required to perform any bounds checking on array access. It can allow you read/write past that without any warning or error.
You're invoking undefined behavior by reading and writing past the end of allocated memory. This means the behavior of your program can't be predicted. It could crash, it could output strange results, or it could (as in your case) appear to work properly.
Just because the program can crash doesn't mean it will.
The code runs without an error, but it is still wrong. You just do not notice it. Your loop runs past the allocated area, but the system remains unaware of that fact until you run out of a much larger area your program can potentially access.
Picture it this way:
<UNAVAILABLE><other data>1234567890<other data><UNAVAILABLE>
Your 10 ints are in the middle of other data, which you can read and even write - to very unpleasant effects. C is not holding your hand here - only once you go out of the total available memory, the program will crash, not before.
Undefined behavior doesn't mean "guaranteed segmentation fault"; it may work in some cases.
There is no way of knowing how far beyond an array's bounds you can go before you finally crash; even dereferencing one element beyond a boundary is undefined behavior.
Also: if malloc succeeds, it will allocate at least as much space as you requested, possibly more.

Dynamic struct array in c using calloc [duplicate]

I use the following call in C to allocate 80 bytes (on a 64-bit system) to d.
double *d = calloc(10, sizeof(double));
And I use the following loop to initialize d:
for (k = 0; k < 11; k++) {
    d[k] = k;
}
When I run the program, there is no error, but I feel that since the loop runs 11 times, there should be something wrong, as d is an array of length 10.
Please let me know why the program is executed with no error.
Thanks in advance.
This is undefined behavior. There might be an error, and it might be silently ignored by the OS, when you break the rules - all bets are off.
What actually happens in the code depends on the OS, the compiler and the architecture you run it on, which might be tolerant to this violation, crash or do something else, the point is - the resulting behavior is undefined.
I believe that C and C++ don't do boundary checks on arrays and pointers as long as the access stays within the program's memory; I believe a segmentation fault is thrown when the access falls outside it.

C dynamic array element access

I'm learning C and trying to build a dynamic array. I found a great tutorial on this but I don't get it all the way. The code I have now is:
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int size;
    int capacity;
    char *data;
} Brry;

void brry_init(Brry *brry){
    brry->size = 0;
    brry->capacity = 2;
    brry->data = (char *)calloc(brry->capacity, sizeof(char));
}

void brry_insert(Brry *brry, char value){
    brry->data[brry->size++] = value; // should check here if there is enough memory, but leaving it out while trying something
}

int main(void){
    Brry brry;
    brry_init(&brry);
    for (int i = 0; i < 3; i++) {
        brry_insert(&brry, 'a');
    }
    printf("%c\n", brry.data[2]);
    return 0;
}
In my main function I add 3 elements to the array, but I only allocated space for 2. Yet when I print it, it works just fine? I expected some strange value to be printed. Why is this, or am I doing something wrong?
You are writing into a buffer you didn't allocate enough memory for. That it works is not guaranteed.
What you're doing now is reading some junk value from memory; who knows what is there. Sometimes that leads to a segmentation fault, and other times you get lucky and just see a junk value without a segfault.
Writing into junk memory will invoke undefined behavior, so better watch it.
If you do get an error, it will almost always be a segfault, short for segmentation fault.
Read up on it here.
What you're doing by reading past the bounds of the array is dereferencing an out-of-bounds pointer. You might also want to read more about that here.
Yes, you are indeed writing to the third element of a two element array. This means your program will exhibit undefined behavior and you have no guarantee of what is going to happen. In your case you got lucky and the program "worked", but you might not always be so lucky.
Trying to read/write past the end of the array results in undefined behaviour. Exactly what happens depends on several factors which you cannot predict or control. Sometimes, it will seem to read and/or write successfully without complaining. Other times, it may fail horribly and effectively crash your program.
The critical thing is that you should never try to use or rely on undefined behaviour. It's unfortunately a common rookie mistake to think that it will always work because one test happened to succeed. That's definitely not the case, and is a recipe for disaster sooner or later.

Why am I able to access other variables using array indexing?

Here len is at A[10] and i is at A[11]. Is there a way to catch these errors?
I tried compiling with gcc -Wall -W but no warnings are displayed.
#include <stdio.h>

int main()
{
    int A[10];
    int i, len;
    len = sizeof(A) / sizeof(0[A]);
    printf("Len = %d\n", len);
    for (i = 0; i < len; ++i) {
        A[i] = i * 19 % 7;
    }
    A[i] = 5;
    A[i + 1] = 6;
    printf("Len = %d i = %d\n", len, i);
    return 0;
}
Output :
Len = 10
Len = 5 i = 6
You are accessing memory outside the bounds of the array; in C, there is no bounds checking done on array indices.
Accessing memory beyond the end of the array technically results in undefined behavior. This means that there are no guarantees about what happens when you do it. In your example, you end up overwriting the memory occupied by another variable. However, undefined behavior can also cause your application to crash, or worse.
Is there a way to catch these errors?
The compiler can catch some errors like this, but not many. It is often impossible to catch this sort of error at compile-time and report a warning.
Static analysis tools can catch other instances of this sort of error and are usually built to report warnings about code that is likely to cause this sort of error.
C does not generally do bounds-checking, but a number of people have implemented bounds-checking for C. For instance there is a patch for GCC at http://sourceforge.net/projects/boundschecking/. Of course bounds-checking does have some overhead, but it can often be enabled and disabled on a per-file basis.
The array allocation of A is adjacent in memory to i and len. Remember that when you address via an array, it's exactly like using pointers, and you're walking off the end of the array, bumping into the other stuff you put there.
C by default does not do bounds checking. You're expected to be careful as a programmer; in exchange you get speed and size benefits.
Usually external tools, like lint, will catch the problems via static code analysis. Some libraries, depending on compiler vendor, will add additional padding or memory protection to detect when you've walked off the end.
Lots of interesting, dangerous, and non-portable things reside in memory at "random spots." Most of the housekeeping for heap memory allocations occurs in memory locations just before the one the allocator gives you.
The general rule is that if you didn't allocate or request it, don't mess with it.
i's location in memory is just past the end of A. That's not guaranteed with every compiler and every architecture, but most probably wouldn't have a reason to do it any other way.
Note that counting from 0 to 9, you have 10 elements.
Array indexing starts from 0, so the largest valid index is one less than the declared size. You are overwriting memory beyond what is allowed.
These errors may not be reported as warnings, but you can use tools like Prevent, Sparrow, Klocwork, or Purify to find such "malpractices", if I may call them that.
The short answer is that local variables are allocated on the stack, and indexing is just like *(ptr + index). So it can happen that the room for int y[N] is adjacent to the room for another int x; e.g. x is located right after the last y. So y[N-1] is that last y, while y[N] is the int past the last y, and in this case, by accident, you happen to get x (or whatever it is in your example). But what you get when going past the bounds of an array is absolutely not a sure thing, so you can't rely on it. Even when undetected, it's an "index out of bounds" error, and a source of bugs.
