C dynamic array element access - c

I'm learning C and trying to build a dynamic array. I found a great tutorial on this, but I don't fully understand it. The code I have now is
typedef struct{
int size;
int capacity;
char *data;
}Brry;
void brry_init(Brry *brry){
brry->size = 0;
brry->capacity = 2;
brry->data = (char *)calloc(brry->capacity, sizeof(char));
}
void brry_insert(Brry *brry, char value){
brry->data[brry->size++] = value; //so do check here if I have enough memory, but checking something out
}
int main(void){
Brry brry;
brry_init(&brry);
for (int i = 0; i < 3; i++) {
brry_insert(&brry, 'a');
}
printf("%c\n", brry.data[2]);
return 0;
}
In my main function I add 3 elements to the array, but it only has room allocated for 2. Yet when I print it, it works just fine. I expected some strange value to be printed. Why is this, or am I doing something wrong?

You are writing into a buffer you didn't allocate enough memory for. That it works is not guaranteed.
What you're doing now is reading whatever value happens to sit in memory just past the buffer. Sometimes that leads to a segmentation fault; other times you are lucky, get some value back, and it doesn't segfault.
Writing past the end of the buffer invokes undefined behavior, so watch out for it.
If you do get an error, it will almost always be a segfault, short for segmentation fault.
It is worth reading up on.
The technical term for accessing memory through a pointer is dereferencing; here you are dereferencing a pointer that points past the end of the allocated buffer. You might also want to read more about that.

Yes, you are indeed writing to the third element of a two element array. This means your program will exhibit undefined behavior and you have no guarantee of what is going to happen. In your case you got lucky and the program "worked", but you might not always be so lucky.

Trying to read/write past the end of the array results in undefined behaviour. Exactly what happens depends on several factors which you cannot predict or control. Sometimes, it will seem to read and/or write successfully without complaining. Other times, it may fail horribly and effectively crash your program.
The critical thing is that you should never try to use or rely on undefined behaviour. It's unfortunately a common rookie mistake to think that it will always work because one test happened to succeed. That's definitely not the case, and is a recipe for disaster sooner or later.

Related

Dynamic array without malloc?

I was reading through some source code and found a functionality that basically allows you to use an array as a linked list? The code works as follows:
#include <stdio.h>
int
main (void)
{
int *s;
for (int i = 0; i < 10; i++)
{
s[i] = i;
}
for (int i = 0; i < 10; i++)
{
printf ("%d\n", s[i]);
}
return 0;
}
I understand that s points to the beginning of an array in this case, but the size of the array was never defined. Why does this work and what are the limitations of it? Memory corruption, etc.
Why does this work
It does not; it only appears to work (which is actually bad luck).
and what are the limitations of it? Memory corruption, etc.
Undefined behavior.
Keep in mind: whatever memory location your program uses must be validly allocated. Either it comes from a definition (a scalar variable, for example), or, for pointer types, you need to make the pointer point to valid memory (the address of a previously defined variable) or allocate memory at run time (using allocator functions). Using an arbitrary memory location through an indeterminate pointer is invalid and will cause UB.
I understand that s points to the beginning of an array in this case
No, the pointer has automatic storage duration and was not initialized:
int *s;
So it has an indeterminate value and points nowhere.
but the size of the array was never defined
There is no array declared or defined in the program.
Why does this work and what are the limitations of it?
It works by chance. That is, it produced the expected result when you ran it. But the program actually has undefined behavior.
As I first pointed out in the comments, what you are doing does not work; it only seems to work, and it is in fact undefined behaviour.
In computer programming, undefined behavior (UB) is the result of
executing a program whose behavior is prescribed to be unpredictable,
in the language specification to which the computer code adheres.
Hence, it might "work" sometimes, and sometimes not. Consequently, one should never rely on such behaviour.
If it were that easy to allocate a dynamic array in C, why would anyone use malloc? Try it with a value larger than 10 to increase the likelihood of a segmentation fault.
Look into the SO thread to see how to properly allocate an array in C.

Using malloc is giving me more memory than expected?

I'm trying to get to grips with malloc, and so far I'm mostly getting unexpected results when testing and playing around with it.
int main(int argc, char** argv)
{
int size = 10;
int *A;
A = (int *)malloc(size * sizeof(int));
for (int i = 0; i < 10000; i++)
{
A[i] = i;
}
for (int i = 0; i < 10000; i++)
{
printf("%d) %d\n", i, A[i]);
}
}
With the example code above, the program runs without an error, even though I only allocated A to hold 10 ints, so I expected the loop to run only 10 times before hitting an error. If I increase the loop count to about 30-40k, it then hits a segmentation fault. However, if I increase size to match the loop count, it always works as expected. So I know how to avoid the error, sort of; I was just hoping someone might be kind enough to explain why this is.
Edit: It turned out I didn't appreciate that C doesn't detect out-of-bounds accesses; I've been looked after way too much by Java and C++. I had undefined behavior and now know it's my job to prevent it. Thanks to everyone who replied.
C isn't required to perform any bounds checking on array access. It lets you read and write past the end of an array without any warning or error.
You're invoking undefined behavior by reading and writing past the end of allocated memory. This means the behavior of your program can't be predicted. It could crash, it could output strange results, or it could (as in your case) appear to work properly.
Just because the program can crash doesn't mean it will.
The code runs without an error, but it is still wrong. You just do not notice it. Your loop runs out of the allocated area, but the system remains unaware of that fact until you run out of a much larger area your program can potentially access.
Picture it this way:
<UNAVAILABLE><other data>1234567890<other data><UNAVAILABLE>
Your 10 ints are in the middle of other data, which you can read and even write - to very unpleasant effects. C is not holding your hand here - only once you go out of the total available memory, the program will crash, not before.
Undefined behavior doesn't mean "guaranteed segmentation fault"; it may work in some cases.
There is no way of knowing how far beyond an array's bounds you can go before you finally crash; even dereferencing one element beyond a boundary is undefined behavior.
Also: if malloc succeeds, it will allocate at least as much space as you requested, possibly more.

How does a program shut down when reading farther than memory allocated to an array?

Good evening everybody, I am learning C++ on Dev C++ 5.9.2, and I am a real novice at it. I intentionally make my programs crash to get a better understanding of bugs. I've just learned that we can pass a char string to a function by initializing a pointer with the address of the array, and that this is the only way to do it. Therefore we should always pass the size of that string to the function so it can handle it properly. It also means that any procedure can run with a wrong size passed in the argument list, hence I supposed we could read farther than the allocated memory assigned to the string.
But how far can we go? I've tested several integers and apparently it works fine below 300 bytes but not above 1000 (the program displays characters but ends up crashing). So my questions are:
How far can we read or write on the string out of its memory range?
Is it called an overflow?
How does the program detect that the procedure is doing something illegitimate?
Is it, the console or the code behind 'cout', that conditions the shutting down of the program?
What is the condition for the program to stop?
Does the limit depend on the console or the OS?
I hope my questions don't sound too trivial. Thank you for any answer. Good day.
#include <iostream>
using namespace std;
void change(char str[])
{
str[0] = 'C';
}
void display(char str[], int lim)
{
for(int i = 0; i < lim; i++) cout << str[i];
}
int main ()
{
char mystr[] = "Hello.";
change(mystr);
display(mystr, 300);
system("PAUSE");
return 0;
}
The behavior when you read past the end of an array is undefined. The compiler is free to implement the read operation in whatever way works correctly when you don't read beyond the end of the buffer, and then if you do read too far - well, whatever happens is what happens.
So there are no definite answers to most of your questions. The program could crash as soon as you read 1 byte too far, or you could get random data as you read several megabytes and never crash. If you crash, it could be for any number of reasons - though it likely will have something to do with how memory is managed by the OS.
As an aside, the normal way to let a function know where a string ends is to end it with a null character rather than passing a separate length value.

C - Off by one error, but no segmentation fault?

I recently wrote this code in C:
#include <stdio.h>
#define N_ROWS 100
int main() {
char *inputFileName = "triangle_data.txt";
FILE *inputFile = fopen(inputFileName, "r");
if (inputFile == NULL) {
printf("ERROR: Failed to open \"%s\".\n", inputFileName);
return -1;
}
int triangle[(N_ROWS*(N_ROWS+1))/2 - 1];
size_t size = sizeof(triangle)/sizeof(int);
size_t index;
for (index = 0; !feof(inputFile); ++index) {
fscanf(inputFile, "%d", &triangle[index]);
}
return 1;
}
and was expecting a segmentation fault, since (N_ROWS*(N_ROWS+1))/2 is just enough space to hold the data in the file, but as you can see I made the array one element smaller. Somehow this doesn't trigger a segmentation fault. It does if I replace the body of the for-loop with:
int tmp;
fscanf(inputFile, "%d", &tmp);
triangle[index] = tmp;
What is happening here? If I make the array three elements too small, it still doesn't trigger a segmentation fault. Five elements too small triggers one. I'm sure there is enough data in the file.
As a test I printed the array afterwards and if I choose a smaller array there were elements missing.
What is happening here?
PS: Compiled with clang on OS X.
A segmentation fault doesn't mean that you accessed an array out of bounds, it means that you've accessed a virtual memory address that isn't mapped. Often accessing an array out of bounds will cause this, but just because you aren't seeing a segfault it doesn't mean that all of your memory accesses are valid.
As to why you're seeing the different behavior, it's hard to say and it isn't necessarily a worthwhile use of time to try justifying different results when the results are specified as undefined. If you're really curious about what's going on you could look at the assembly generated by the two versions of your code (use the --save-temps argument to clang).
What is happening here?
Your program invokes undefined behavior because you are writing outside your array object. Undefined behavior in C is just that, undefined: your program can work today and crash every other day, or even print Shakespeare's complete works.
The behaviour of your program (accessing an array element out of bounds) is undefined.
There is no particular requirement that undefined behaviour result in a segmentation fault, or any other observable error condition.
Undefined behaviour means - literally - that the C standard does not impose any restrictions on what is allowed to occur. That means anything can happen, including appearing to work correctly, or working in one circumstance but not another.
The trick is not to worry about the particular potential causes of segmentation faults (or any other error condition that any instance of undefined behaviour might trigger). It is to ensure the program has well-defined behaviour, so such symptoms are guaranteed not to occur.

Accessing non-existing array element

int main(int argc, char const *argv[])
{
int anArray[5];
anArray[0] = 54;
anArray[1] = 54;
anArray[2] = 54;
anArray[3] = 54;
anArray[4] = 54;
anArray[5] = 54;
anArray[6] = 54;
anArray[7] = 54;
printf ("%i\n", anArray[7]);
return 0;
}
This prints 54.
How does this even work? We say that C arrays are not dynamic. Why should this even compile? or even if it compiles, it should throw a seg fault.
I have defined an array of 5 elements, then I accessed elements 5,6,7. Why is it possible to assign a value to, for example, anArray[5]?
Please note that I have a c++ background and I haven't used this kind of array for a long time.
You are scribbling into memory that you don't own, so anything could happen. You got lucky and the computer let you write and then read the value in that location. But it's just luck: the behavior is undefined.
Note that the exact same thing applies to C++ (since you mentioned it), not only with C-style arrays but also with std::vector::operator[] and std::array in C++11. In C++ you can use vec.at(idx) instead of vec[idx] to do bounds checking always.
The language itself doesn't say the runtime or the compiler has to check you're actually accessing elements inside the bounds of the array. The compiler could emit a warning, but that's it. You are responsible for accessing valid elements. Not doing so results in undefined behavior, which means anything can happen, including appearing to work.
You're basically reading and writing memory where you don't know what's there. This can be a useful thing in C (if you really know what you're doing), but it can also cost you hours of frustrating debugging, because what happens there is undefined behaviour.
From wikipedia:
Many programming languages, such as C, never perform automatic bounds checking to raise speed. However, this leaves many off-by-one errors and buffer overflows uncaught. Many programmers believe these languages sacrifice too much for rapid execution.
No compiler error: the compiler is not required to diagnose an out-of-bounds access.
No run-time error: the behavior is undefined, and you were lucky that the memory location you were trying to access was free at that time.
